(54) Title: REGULATORY ZINC FINGER PROTEINS

GAATTCTGTGCCCTCACTCCCCTGGATCCCTGGGCAAAGCCCCAGAGGGAAACACAAACAGGTTGTTGTA 
ACACACCTTGCTGGGTACCACCATGGAGGACAGfTGGCTTATGGGGGTGGGGGGTGCCTGGGGCCACGGA 
GTGACTGGTGATGGCTATCCCTCCTTGGAACCCGTCCAGCCTCCTCTTAGCTTCAGATTTGTTTATTTGT 
TTTTTACTAAGACCTGCTCTTTCAGGTCTGTTGGCTCTTTTAGGGGCTGAAGAAGGCCGAGTTGAG7\AGG 
GATGCAAGGGAGGGGGCCAGAATGAGCCCTTAGGGCTCAGAGCCTCCATCCTGCCCCAAGATGTCTACAG 
CTTGTGCTCCTGGGGTGCTAGAGGCGCACAAGGAGGAAAGTTAGTGGCTTCCCTTCCATATCCCGTTCAT 
CAGCCTAGAGCATGGAGCCCAGGTGAGGAGGCCTGCCTGGGAGGGGGCCCTGAGCCAGGAAATAAACATT 
TACTAACTGTACAAAGACCTTGTCCCTGCTGCTGGGGAGCCTGCCAAGTGGTGGAGACAGGACTAGTGCA 
CGAATGATGGAAAGGGAGGGTTGGGGTGGGTGGGAGCCAGCCCTTTTCCTCATAAGGGCCTTAGGACACC 
ATACCGATGGAACTGGGGGTACTGGGGAGGTAACCTAGCACCTCCACCAAACCACAGCAACATGTGCTGA 
GGATGGGGCTGACTAGGTAAGCTCCCTGGAGCGTTTTGGTTAAATTGAGGGAAATTGCTGCATTCCCATT
CTCAGTCCATGCCTCCACAGAGGCTATGCCAGCllGTAGGCCAGACCCTGGCAAGATCTGGGTGGATAATC
AGACTGACTGGCCTCAGAGCCCCAACTTTGTTCCCTGGGGCAGCCTGGAAATAGCCAGGTCAGAAACCAG
CCAGGAATTTTTCCAAGCTGCTTCCTATATGCAAGAATGGGATGGGGGCCTTTGGGAGCACTTAGGGAAG
ATGTGGAGAGTTGGAGGAAAAGGGGGCTTGGAGdTAAGGGAGGGGACTGGGGGAAGGATAGGGGAGAAGC
TGTGAGCCTGGAGAAGTAGCC/^AGGGATCCTGAQGGAATGGGGGAGCTGAGACGAAACCCCCATTTCTAT 
TCAGAAGATGAGCTATGAGTCTGGGCTTGGGCTGATAGAAGCCTTGGCCCCTGGCCTGGTGGGAGCTCTG 
GGCAGCTGGCCTACAGACGTTCCTTAGTGCTGGCGGGTAGGTTTGAATCATCACGCAGGCCCTGGCCTCC 
ACCCGCCCCCACCAGCCCCCTGGCCTCAGTTCCGTGGCAACATCTGGGGTTGGGGGGGCAGCAGGAACAA 
GGGCCTCTGTCTGCCCAGCTGCCTCCCCCTTTGG.GTTTTGCCAGACTCCACAGTGCATACGTGGGCTCCA 

(57) Abstract: Disclosed are chimeric zinc finger proteins that can regulate endogenous genes. Examples of such proteins include 
proteins that can regulate VEGF-A expression. The proteins and nucleic acid encoding them can be used to modulate angiogenesis. 
REGULATORY ZINC FINGER PROTEINS

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims priority to U.S. Application Serial No. 60/431,892, filed on
December 9, 2002, the contents of which are incorporated by reference herein. 

TECHNICAL FIELD

This invention relates to DNA-binding proteins such as transcription factors. 

BACKGROUND 

Most genes are regulated at the transcriptional level by polypeptide transcription
factors that bind to specific DNA sites within the gene, typically in promoter or enhancer
regions. These proteins activate or repress transcriptional initiation by RNA polymerase at
the promoter, thereby regulating expression of the target gene. Many transcription factors,
both activators and repressors, are modular in structure. Such modules can fold as
structurally distinct domains and have specific functions, such as DNA binding, dimerization,
or interaction with the transcriptional machinery. Effector domains such as activation
domains or repression domains retain their function when transferred to DNA-binding
domains of heterologous transcription factors. Brent and Ptashne (1985) Cell 43:729-36;
Dawson et al. (1995) Mol Cell Biol 15:6923-31. The three-dimensional structures of many
DNA-binding domains, including zinc finger domains, homeodomains, and helix-turn-helix
domains, have been determined from NMR and X-ray crystallographic data.

Zinc finger domains are one type of structural domain that is modular in function.
Zinc finger proteins (ZFPs) can be used to regulate transcription. For example, Kim and
Pabo demonstrated that the Zif268 protein efficiently repressed VP16-activated transcription of
a target gene when the Zif268 protein was bound near the transcription start site of the target gene.
Kim and Pabo (1997) J Biol Chem. 272:29795-29800. Liu et al. describe up-regulating VEGF-A
using engineered zinc finger proteins constructed by site-specific mutagenesis. Liu et al.
(2001) J Biol Chem. 276:11323-11334.
SUMMARY

In one aspect, the invention features a polypeptide that includes a DNA binding
domain and can regulate expression of a gene in a cell, e.g., a eukaryotic cell. In one
embodiment, the polypeptide binds to a target DNA site in the gene. The DNA binding
domain typically includes at least three zinc finger domains.

In one embodiment, at least one, two, or three of the zinc finger domains are
naturally-occurring zinc finger domains. For example, these domains can be identical to
zinc finger domains of different naturally occurring proteins, or identical to non-adjacent zinc
finger domains from a naturally occurring protein. All the zinc finger domains can be
naturally-occurring.

In another embodiment, at least one, two, or three of the zinc finger domains is a
variant of a naturally-occurring zinc finger domain, e.g., a domain that differs by between
one and four or two and five amino acid residues. The polypeptide may include a
combination of naturally-occurring zinc finger domains and variant domains.

The polypeptide may regulate any gene. Regulation can be direct, such as when the
polypeptide interacts with a target site in the target gene. For example, the gene can be an
endogenous gene of a cell (e.g., a gene present in a natural genome), a heterologous gene
(e.g., a transgene) or a viral gene. In one embodiment, the endogenous gene encodes a
secreted polypeptide or a polypeptide that participates in or regulates production of a secreted
factor, e.g., a secreted polypeptide. In one embodiment, the endogenous gene regulates cell
proliferation, cell migration, or tissue morphogenesis (e.g., angiogenesis).

In one embodiment, the endogenous gene encodes a polypeptide that regulates 
hormone synthesis, a hormone, or growth factor. Exemplary growth factors include the 
VEGF family of growth factors. 

VEGF-A is one member of this family. In one embodiment, the polypeptide
recognizes a target site in the regulatory region of the VEGF-A gene, e.g., a site located
between -950 and +450 of the VEGF-A gene. For example, the polypeptide can recognize a
site that is located at about -680, -677, -671, -668, -665, -633R, -632R, -631, -630, -606,
-603, -554, -536, -495, -475, -468, -465, -462, -455, -395R, -394R, -393R, -392, -391R, -385R,
-382R, -358R, -314R, -282, -206, -206, -203, -184, -181, -137, -124, -90R, -85, -30, 77, 244R,
283R, 342, 357, 366, 434, 435, or 474R of the human VEGF-A promoter, or a site within 60,
50, 20, 10, 5, or 3 nucleotides of such sites. These nucleotide positions indicate the 5' most
nucleotide of the site, counted from the transcriptional initiation site on the upper strand of
the promoter, unless the letter "R" appears, in which case the numbering of those positions
(with the R designation) indicates the 5' most nucleotide of the site on the lower (reverse)
strand. For example, the F435 (-90R) target sequence is 5':-90 to 3':-98 on the reverse strand
(5':-98 to 3':-90 from the transcriptional initiation site on the upper strand). In one
embodiment, the polypeptide competes with a polypeptide having a sequence described herein
for binding to its target site in the VEGF-A gene.
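The position convention above can be illustrated with a short Python sketch. It assumes plain integer arithmetic on promoter coordinates (no special handling of sites that span the initiation site), and the function name and the `site_length` parameter are hypothetical; the text itself gives only the F435 (-90R) example, which corresponds to a nine-nucleotide site.

```python
def upper_strand_span(label: str, site_length: int) -> tuple[int, int]:
    """Return (start, end) of a target site in upper-strand coordinates.

    label: a position such as "-680" or "-90R". The number is the 5'-most
    nucleotide of the site; a trailing "R" means it is read on the lower
    (reverse) strand.
    site_length: number of nucleotides in the site (an assumed input).
    """
    reverse = label.endswith("R")
    five_prime = int(label[:-1] if reverse else label)
    if reverse:
        # Reading 5'->3' on the lower strand moves toward more negative
        # upper-strand positions, so the labeled position is the upper-strand end.
        return five_prime - (site_length - 1), five_prime
    # On the upper strand the labeled position is the start of the span.
    return five_prime, five_prime + (site_length - 1)


print(upper_strand_span("-90R", 9))   # (-98, -90), matching the F435 example
print(upper_strand_span("-680", 9))   # (-680, -672)
```
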

In one embodiment, the target site is in a regulatory region of the endogenous gene,
overlaps with a DNase hypersensitive site, or overlaps with the binding site of an endogenous
transcription factor. In another embodiment, the target site is within 700, 500, 300, 200, 50,
20, 10, 5, or 3 basepairs of such a site or region. In one embodiment, the polypeptide binds
to the target site with a dissociation constant of no more than 20, 7, 5, 3, 2, 1, 0.5, or 0.05 nM.

In one embodiment, when the polypeptide is in a cell, it is able to alter transcription
(e.g., repress or activate transcription) of the endogenous gene at least 1.25, 1.5, 1.7, 1.9,
2.0, 2.5, 5, 10, 20, 50, or 100 fold. The polypeptide may have a similar effect when in a cell
in an organism.

In one embodiment, the DNA binding domain includes at least two zinc finger
domains listed in a single row of Table 1, Table 2, Table 3, Table 4, or Table 5 or includes at
least two zinc finger domains that have identical DNA contacting residues as two zinc finger
domains listed in a single row of Table 1, Table 2, Table 3, Table 4, or Table 5.

The polypeptide can further include a transcriptional activation or repression domain. 
The polypeptide can further include a cell transduction domain, e.g., the HIV tat transduction 
domain. 

In one embodiment, the polypeptide suppresses induction of VEGF-A production by
hypoxia in a mammalian cell. The suppression can be, e.g., such that VEGF-A levels are less
than 80, 70, 60, 50, 40, 30, 20, 10, 5, 3, 2, 1, or 0.1% of the protein level induced by hypoxia
in an otherwise identical cell that lacks the polypeptide.

The invention also provides a nucleic acid that includes a sequence that encodes a
polypeptide described herein and a cell (e.g., a prokaryotic or eukaryotic, e.g., mammalian
cell) that includes the nucleic acid. The cell can express the nucleic acid and produce the
polypeptide. In one embodiment, the cell is cultured in vitro. The cell can be immuno-isolated
or encapsulated. The invention also provides an organism that includes one or more
cells in which the polypeptide is produced and an endogenous gene is regulated by the
polypeptide.

WO 2004/053130 PCT/KR2003/002693

In another aspect, the invention features a method of regulating an endogenous gene,
the method including: providing a cell that includes a coding nucleic acid encoding an
artificial polypeptide that includes at least three zinc finger domains, wherein the polypeptide
binds to a target DNA site in an endogenous gene; and expressing the coding nucleic acid in
the cell under conditions in which the artificial polypeptide is produced, binds to the target
DNA site, and regulates the endogenous gene. In one embodiment, at least two of the zinc
finger domains are naturally-occurring zinc finger domains. For example, the two zinc finger
domains can be identical to zinc finger domains of different naturally occurring proteins, or
can be non-adjacent zinc finger domains from the same naturally occurring protein.

In one embodiment, the artificial polypeptide includes a transcriptional activation or
repression domain. The endogenous gene can be repressed or activated. In one embodiment,
the cell is provided by contacting the cell with a nucleic acid delivery vehicle, e.g., a
liposome, virus, or viral particle. In one embodiment, the cell is a cell within an organism,
e.g., a mammalian organism. The method can further include, prior to the expressing,
introducing the cell into a subject organism, or encapsulating the cell and introducing the
encapsulated cell into a subject organism.

Exemplary polypeptides can include at least two or more zinc finger domains, e.g.,
two, three or four zinc finger domains in a particular row of a table below:
Table 1: Exemplary VEGF-A Binding Proteins (A)

Name     Motifs (Col. 2)        Specific Domains (Col. 3)
F475     mQSHR-mRDHT-mRSNR      QSHR2-RDHT-RSNR
F121     mQSHT-mRSHR-mRDHT      QSHT-RSHR-RDHT
F435     mQSHR-mRDHT-mRSHR      QSHR2-RDHT-RSHR
F547     mRSHR-mRDHT-mVSNV      RSHR-RDHT-VSNV
F2825    mQSHV-mRDHR-mRDHT      QSRV-RDHR1-RDHT



Table 2: Exemplary VEGF-A Binding Proteins (B)

Name     Motifs (Col. 2)        Specific Domains (Col. 3)
F480     mRSHR-mRDHT-mRSHR      RSHR-RDHT-RSHR
F2828    mCSNR-mWSNR-mRDHR      CSNR1-WSNR-RDHR1
F625     mCSNR-mWSNR-mRSHR      CSNR1-WSNR-RSHR
F2830    mDSNR-mWSNR-mRDHR      DSNRa-WSNR-RDHR1
F2838    mDSNR-mWSNR-mRSHR      DSNRa-WSNR-RSHR



Table 3: Exemplary VEGF-A Binding Proteins (C)

Name     Motifs (Col. 2)              Specific Domains (Col. 3)
F109     mRDER-mQSSR-mQSHT-mRSNR      RDER1-QSSR1-QSHT-RSNR
F2604    mDSAR-mRSNR-mRDHT-mVSSR      DSAR2-RSNR-RDHT-VSSR
F2605    mQSHT-mDSAR-mRSNR-mRDHT      QSHT-DSAR2-RSHR-RDHT
F2607    mRDHT-mVSNV-mQSHT-mDSAR      RDHT-VSNV-QSHT-DSAR2
F2615    mRSHR-mDSCR-mQSHT-mDSCR      RSHR-DSCR-QSHT-DSCR
F2633    mQSNR-mQSHR-mRDHT-mRSNR      QSNR3-QSHR2-RDHT-RSNR
F2634    mCSNR-mRDHT-mRSNR-mRSHR      CSNR1-RDHT-RSNR-RSHR
F2636    mRSHR-mQSHT-mRSHR-mRDER      RSHR-QSHT-RSHR-RDER1
F2644    mQSNR-mRSHR-mQSSR-mRSHR      QSNR3-RSHR-QSSR1-RSHR
F2646    mQSHT-mDSCR-mRDHT-mCSNR      QSHT-DSCR-RDHT-CSNR1
F2650    mQSHT-mWSNR-mRSHR-mWSNR      QSHT-WSNR-RSHR-WSNR
F2679    mVSNV-mRSHR-mRDER-mQSNV      VSNV-RSHR-RDER1-QSNV2



Table 4: Exemplary VEGF-A Binding Proteins (D)

Name     Motifs (Col. 2)              Specific Domains (Col. 3)
F2610    mRSNR-mRSHR-mRDHT-mRSHR      RSNR-RSHR-RDHT-RSHR
F2612    mRSHR-mRDHT-mRSHR-mRDHT      RSHR-RDHT-RSHR-RDHT
F2638    mRSNR-mQSHR-mRDHT-mRSHR      RSNR-QSHR2-RDHT-RSHR
Table 5: Exemplary VEGF-A Binding Proteins (E)

Name     Motifs (Col. 2)              Specific Domains (Col. 3)
F2608    mRSHR-mRDHT-mVSNV-mQSHT      RSHR-RDHT-VSNV-QSHT
F2611    mRSHR-mRSHR-mWSNR-mRSHR      RSHR-RSHR-WSNR-RSHR
F2617    mRDER-mRSHR-mDSCR-mQSHT      RDER1-RSHR-DSCR-QSHT
F2619    mRSHR-mVSTR-mQSNR-mRDHT      RSHR-VSTR-QSNR3-RDHT
F2623    mQSHT-mRSNR-mWSNR-mRDER      QSHT-RSNR-WSNR-RDER1
F2625    mQSHT-mWSNR-mRDHT-mRDER      QSHT-WSNR-RDHT-RDER1
F2628    mVSSR-mWSNR-mRSNR-mVSSR      VSSR-WSNR-RSNR-VSSR
F2629    mQSHR-mVSSR-mWSNR-mRSNR      QSHR2-VSSR-WSNR-RSNR
F2630    mRDER-mQSHR-mVSSR-mWSNR      RDER1-QSHR2-VSSR-WSNR
F2635    mQSHR-mRSNR-mQSHR-mRDHT      QSHR2-RSNR-QSHR2-RDHT
F2637    mRDHT-mRSNR-mRSHR-mWSNR      RDHT-RSNR-RSHR-WSNR
F2642    mRDHT-mRSHR-mCSNR-mRDHT      RDHT-RSHR-CSNR1-RDHT
F2643    mRSHR-mCSNR-mRDHT-mCSNR      RSHR-CSNR1-RDHT-CSNR1
F2648    mQSSR-mQSHR-mRSNR-mRSNR      QSSR1-QSHR2-RSNR-RSNR
F2651    mVSTR-mQSHT-mWSNR-mRSHR      VSTR-QSHT-WSNR-RSHR
F2653    mVSTR-mQSNR-mRSHR-mQSNR      VSTR-QSNR3-RSHR-QSNR3
F2654    mQSNR-mRSHR-mQSNR-mVSNV      QSNR3-RSHR-QSNR3-VSNV
F2662    mDSCR-mRDHT-mVSTR-mRDER      DSCR-RDHT-VSTR-RDER1
F2667    mRSHR-mDSCR-mRDHT-mRSHR      RSHR-DSCR-RDHT-RSHR
F2668    mRSHR-mRSHR-mQSNV-mQSNV      RSHR-RSHR-QSNV2-QSNV2
F2673    mRDHT-mVSSR-mRDER-mQSSR      RDHT-VSSR-RDER1-QSSR1
F2682    mRSNR-mQSSR-mQSNR-mRSHR      RSNR-QSSR1-QSNR3-RSHR
F2689    mRSNR-mDSAR-mQSNR-mQSHT      RSNR-DSAR2-QSNR3-QSHT
F2697    mRSHR-mCSNR-mQSHT-mRSNR      RSHR-CSNR1-QSHT-RSNR
F2699    mRSNR-mQSHT-mDSAR-mRSHR      RSNR-QSHT-DSAR2-RSHR
F2703    mQSHR-mRSHR-mRDER-mRSHR      QSHR2-RSHR-RDER1-RSHR
F2702    mRSHR-mQSHR-mRSHR-mQSNV      RSHR-QSHR2-RSHR-QSNV2
In one aspect, the invention features a polypeptide that includes a DNA binding
domain. The DNA binding domain has a plurality of zinc finger domains. The polypeptide
can alter the expression or production of VEGF-A in cells. For example, the polypeptide can
alter the normal response of the cells to a signal that would increase or decrease VEGF-A
production or expression. In one embodiment, the polypeptide can suppress induction of
VEGF-A production or expression in cells under conditions in which VEGF-A production or
expression is induced. For example, the suppression can have a magnitude such that VEGF-A
protein or mRNA levels are less than 80, 70, 60, 50, 40, 30, 20, 10, 5, 3, 2, 1, or 0.5% of
the level induced by the condition in an otherwise identical cell that lacks the polypeptide. In
one embodiment, the condition includes hypoxia.

These conditions can be determined with particularity in human embryonic kidney
293F cells, e.g., as described in the examples below.

The polypeptide can be used in a wide variety of implementations, e.g., in a human
cell in culture or in an organism, e.g., in a human or non-human mammalian organism.

In one embodiment, the polypeptide binds to a site in the human VEGF-A gene. In 
another embodiment, the polypeptide functions indirectly, e.g., it binds to a site in another 
gene. 

In one embodiment, the polypeptide includes a repression domain. The polypeptide
can include other features described herein. The invention also features a composition, e.g., a
pharmaceutical composition that includes the polypeptide or a nucleic acid encoding the
polypeptide.

The composition can be administered to a subject, e.g., in an amount effective to
reduce angiogenesis in the subject, e.g., in the vicinity of a lesion in the subject (e.g., a
neoplasm) or throughout the subject. In one embodiment, the subject is a human that has or
is suspected of having a metastatic cancer.

With respect to any featured polypeptide, the polypeptide can further include a
heterologous sequence, e.g., a nuclear localization signal, a small molecular binding domain
(e.g., a steroid binding domain), an epitope tag or purification handle, a catalytic domain (e.g.,
a nucleic acid modifying domain, a nucleic acid cleavage domain, or a DNA repair catalytic
domain), a transcriptional function domain (e.g., an activation domain, a repression domain,
and so forth), a protein transduction domain (e.g., from HIV tat), and/or a regulatory site (e.g.,
a phosphorylation site, ubiquitination site, or protease cleavage site).

The polypeptide can be formulated in a pharmaceutical composition, e.g., with one or
more additional components. The composition or polypeptide can be included in a kit that
also includes another agent or instructions for use, e.g., therapeutic use.

The polypeptide can be attached (covalently or non-covalently) to a solid support, e.g.,
a bead, matrix, or planar array. The polypeptide can also be attached to a label such as a
radioactive compound, a fluorescent compound, another detectable entity, or a component of
a detection system (e.g., a chemiluminescent agent).

The invention also includes an isolated nucleic acid that includes a sequence encoding
one of the aforementioned polypeptides. The nucleic acid can further include an operably
linked regulatory sequence, e.g., a promoter, a transcriptional enhancer, a 5' untranslated
region, a 3' untranslated region, a virus packaging sequence, and/or a selectable marker. The
nucleic acid can be packaged in a virus, e.g., a virus that can infect a mammalian cell, e.g., a
lentivirus, retrovirus, pox virus, or adenovirus.

The invention further provides a cell that contains the polypeptide or the nucleic acid
that includes a sequence encoding the polypeptide. The cell can be within a tissue in a
subject organism or in culture. The cell can be an animal (e.g., mammalian, e.g., a human or
non-human cell), plant, or microbial (e.g., fungal or bacterial) cell. The cell can be prepared
by introducing the polypeptide into the cell or a parent cell or by introducing the nucleic acid
into the cell or parent cell. The nucleic acid can be used to produce the polypeptide in the
cell.

The invention still further includes a non-human transgenic mammal, e.g., a mouse,
rat, pig, rabbit, cow, goat, or sheep. The genetic complement of the transgenic mammal
includes the nucleic acid sequence encoding the chimeric zinc finger polypeptide described
above and elsewhere herein. The invention also includes a method of producing the
polypeptide, e.g., by expressing the nucleic acid, and of using the polypeptide, e.g., to
regulate endogenous genes or viral genes in a cell.

With respect to any polypeptide herein that regulates VEGF-A, the polypeptide can be
used in a method of regulating VEGF-A expression in a cell. The method includes
introducing the polypeptide or a nucleic acid that includes a sequence encoding the
polypeptide into a cell. For example, the polypeptide can be introduced using a liposome or
by fusion to a protein transduction domain. A nucleic acid can be introduced, e.g., by
transfection or viral delivery.

The invention also features a composition, e.g., a pharmaceutical composition that
includes a polypeptide that regulates VEGF-A, e.g., as described herein, or a nucleic acid
encoding the polypeptide. In one embodiment, the polypeptide can suppress VEGF-A
expression and the composition can be administered to a subject, e.g., in an amount effective
to reduce angiogenesis in the subject, e.g., in the vicinity of a lesion in the subject (e.g., a
neoplasm) or throughout the subject. In one embodiment, the subject is a human that has or
is suspected of having a metastatic cancer.

In another embodiment, the polypeptide can increase VEGF-A expression, and the
composition is administered to a subject in an amount effective to increase angiogenesis in
the subject. For example, increased angiogenesis can be required for, e.g., vascular formation,
embryonic development, somatic growth, differentiation of the nervous system, maintenance of
pregnancy, wound healing, etc. Vascular endothelial growth factor (VEGF-A), an
endothelial cell-specific growth factor, is a key factor that regulates endothelial cell growth
and differentiation.

Insufficient levels of VEGF or its VEGF164 and VEGF188 isoforms lead to post-natal
angiogenesis and ischemic heart disease. Activation of VEGF-A can be used for the
treatment or prevention of peripheral artery disease and coronary artery disease. For example,
the subject can be a human that has or is suspected of having a wound (internal or external),
pregnancy, a neurological problem, an embryonic developmental problem, or a cardiovascular
disease (e.g., ischemic heart disease, peripheral artery disease, or coronary artery disease).

At least 5 isoforms of VEGF-A protein are produced from different splice variants.
These isoforms have different [...]. A zinc
finger protein, e.g., a protein described herein, may, in some implementations, enable
upregulation of all or particular splice variants that are important for a desired clinical
outcome. For example, the zinc finger protein may modulate expression of all splice variants,
or it may modulate expression of a subset of splice variants, e.g., at least one splice variant.

In another aspect, the invention features an encapsulated composition that includes an
encapsulation layer composed of a biocompatible material, and recombinant mammalian
cells, wherein the cells contain a nucleic acid comprising a sequence encoding a chimeric
zinc finger protein that regulates production of a factor, e.g., a secreted factor or a non-secreted
protein, e.g., a cytoplasmic protein. In one embodiment, the biocompatible material is
permeable at least to proteins having a molecular weight of 10, 20, 30, or 40 kDa. The
biocompatible material can retain proteins larger than, e.g., 50, 100, 120, or 200 kDa.

The invention also provides a rapid and scalable cell-based method for identifying
and constructing chimeric proteins, e.g., transcription factors. Such transcription factors can
be used, for example, for altering the expression of endogenous genes in biomedical and
bioengineering applications. Activity of the transcription factors can be assayed in vivo and
in cultured cells, e.g., in intact, living cells in culture.

In yet another aspect, the invention features a method of characterizing a chimeric
zinc finger protein, e.g., a zinc finger protein described herein. The method includes:
introducing a nucleic acid that encodes the protein into a cell; expressing the nucleic acid;
and evaluating expression of a target gene. For example, the evaluating can include
determining the profile of expression of endogenous genes in the cell. Such an expression
profile includes a plurality of values, wherein each value corresponds to the level of
expression of a different gene, splice-variant or allelic variant of a gene (i.e., mRNA level) or
the abundance of a translation product (i.e., protein level). The value can be a qualitative or
quantitative assessment of the level of expression of the gene or the translation product of the
gene, i.e., an assessment of the abundance of 1) an mRNA transcribed from the gene, or 2)
the polypeptide encoded by the gene.

In yet another aspect, the invention features a method of identifying a chimeric zinc
finger protein that can bind to a particular target site. The method includes: providing data
records, each record associating an identifier for a naturally-occurring zinc finger domain
(e.g., a human zinc finger domain) and at least one 3- or 4-basepair subsite that is recognized
by the zinc finger domain referenced by the identifier; parsing the target site into at least two
3- or 4-basepair subsites; for each of the subsites, retrieving a set of the identifiers from the
data records, the set comprising identifiers for the zinc finger domains that recognize the
subsite; and designing a polypeptide that comprises a zinc finger domain for each of the
subsites, the zinc finger domain being referenced by an identifier from the set for the
respective subsite.
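The record lookup and design steps just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as implemented by the applicants: the domain identifiers (`ZF_A`, `ZF_B`, `ZF_C`), their subsite assignments, and all function names are hypothetical; the target is split into non-overlapping 3-basepair subsites only; and the orientation of fingers relative to the DNA strand is not modeled.

```python
from itertools import product

# Hypothetical data records: domain identifier -> subsites it recognizes.
RECORDS = {
    "ZF_A": {"GAG"},
    "ZF_B": {"GCC", "GAG"},
    "ZF_C": {"TGG"},
}


def parse_subsites(target: str, size: int = 3) -> list[str]:
    """Split a target site into consecutive, non-overlapping subsites."""
    if len(target) < 2 * size or len(target) % size != 0:
        raise ValueError("target must contain at least two whole subsites")
    return [target[i:i + size] for i in range(0, len(target), size)]


def identifiers_for(subsite: str) -> list[str]:
    """Retrieve the identifiers of all domains recorded as recognizing a subsite."""
    return sorted(name for name, sites in RECORDS.items() if subsite in sites)


def design(target: str, size: int = 3) -> list[tuple[str, ...]]:
    """List every domain combination with one domain chosen per subsite."""
    candidate_sets = [identifiers_for(s) for s in parse_subsites(target, size)]
    # product() yields nothing if any subsite has no recorded domain.
    return list(product(*candidate_sets))


for combo in design("GAGGCCTGG"):
    print("-".join(combo))
```

Each returned tuple names one candidate polypeptide; in practice the candidates would then be synthesized and tested for binding.
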

The data records can include a record that identifies a zinc finger domain of interest.
The method can further include the step of synthesizing a nucleic acid that encodes the
polypeptide and/or synthesizing the polypeptide in vitro. The method can also include the
step of assessing the binding of the polypeptide to the target site, e.g., using an in vitro
binding assay or an in vivo assay such as an assay for target gene expression. The
synthesized polypeptide can further include an activation or repression domain.
In one embodiment, the method further includes assessing the ability of the
polypeptide to alter the expression of one or more endogenous genes. The assessing can
include profiling the expression of multiple endogenous genes, e.g., using nucleic acid
microarrays, or a single or limited number of genes. The method can also further include
contacting the polypeptide with a DNA that includes the target site, e.g., in vitro.

In another embodiment, the method further includes retrieving a nucleic acid 
encoding the polypeptide from an addressed library of nucleic acids, each nucleic acid of the 
library including a sequence encoding first and second zinc finger domains. 

In another aspect, the invention features certain polypeptides and isolated nucleic
acids. A polypeptide of the invention can include, for example, one, two, three, or four zinc
finger domains and be related to a reference polypeptide that has a particular amino acid
sequence provided herein. For example, the polypeptide can have the same DNA-contacting
residues in one, two, three, four or more zinc finger domains as the DNA-contacting residues
in respective zinc finger domains of the reference polypeptide. In another example, in three
zinc finger domains of the polypeptide, at least 9, 10, or 11 of the DNA-contacting residues
(3 x 4) are identical to the DNA-contacting residues of respective zinc finger domains in the
reference polypeptide. In another example, in four zinc finger domains of the polypeptide, at
least 12, 13, 14, or 15 of the DNA-contacting residues (4 x 4) are identical to the DNA-
contacting residues of respective zinc finger domains in the reference polypeptide. The
polypeptide can be able to bind to the same site as the reference polypeptide, and regulate the
same endogenous gene, e.g., within 0.1 to 10 or 0.5 to 1.5 fold of the activity of the reference
polypeptide.
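The residue counts above (out of 3 x 4 = 12 or 4 x 4 = 16 DNA-contacting positions) can be made concrete with a small Python sketch. It assumes each finger is represented by its four DNA-contacting residues as a four-letter string and that fingers are compared position by position in order; the function name and the candidate sequence are hypothetical, while the reference motifs are those listed for F475 in Table 1.

```python
def shared_contact_residues(candidate: list[str], reference: list[str]) -> int:
    """Count identical DNA-contacting residues between respective fingers."""
    if len(candidate) != len(reference):
        raise ValueError("both polypeptides must have the same number of fingers")
    total = 0
    for cand_finger, ref_finger in zip(candidate, reference):
        if len(cand_finger) != 4 or len(ref_finger) != 4:
            raise ValueError("each finger needs exactly four contact residues")
        # Compare the four contact positions of this finger pairwise.
        total += sum(a == b for a, b in zip(cand_finger, ref_finger))
    return total


reference = ["QSHR", "RDHT", "RSNR"]   # motifs of F475 (Table 1)
candidate = ["QSHT", "RDHT", "RSNR"]   # hypothetical variant, one residue changed
print(shared_contact_residues(candidate, reference), "of", 4 * len(reference))
# prints: 11 of 12
```
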

In one embodiment, the amino acid sequences of one or more (e.g., all) of the zinc
finger domains are naturally occurring sequences. In one embodiment, the polypeptide is
able to regulate a target gene, e.g., an endogenous cellular gene, e.g., the same gene as the
reference polypeptide, e.g., VEGF-A.

In addition, purified polypeptides of the invention can have an amino acid sequence at
least 50%, 60%, 70%, 80%, 90%, 93%, 95%, 96%, 98%, 99%, or 100% identical to a zinc
finger domain described herein. The polypeptides can be identical to a zinc finger domain
described herein at the amino acid positions corresponding to the DNA contacting residues of
the polypeptide. Alternatively, the polypeptides differ from a zinc finger domain described
herein at at least one of the residues corresponding to the DNA contacting residues of the
polypeptide. For example, one or more zinc finger domains in the polypeptides include a
conservative substitution at a DNA contacting residue.

The polypeptides can also differ at at least one, two, or three residues, e.g., residues
other than a DNA contacting residue. For example, within a given zinc finger domain, the
polypeptide may differ by a single amino acid from the amino acid sequences referenced
above, or by two, three, or four amino acids from the sequences referenced above. The
difference may be due to a conservative substitution as defined herein. In one embodiment,
the amino acid differences with respect to the sequences referenced above are located
between the second zinc-coordinating cysteine and the -1 DNA contacting position (referring
to the numbering system for DNA contacting positions described below).

The comparison of sequences and determination of percent identity between two
sequences can be accomplished using a mathematical algorithm. In particular, the percent
identity between two amino acid sequences is determined using the Needleman and Wunsch
((1970) J. Mol. Biol. 48:444-453) algorithm, which has been incorporated into the GAP
program in the GCG software package, using a Blosum 62 scoring matrix with a gap penalty
of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
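As a rough illustration of the final step only, the Python sketch below computes percent identity from two sequences that have already been globally aligned. It does not reimplement the GAP program or its scoring parameters; the function name, the example sequences, and the choice to ignore gapped columns in the denominator are assumptions (other conventions for the denominator exist).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over alignment columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (assumed convention)
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned sequences: 9 ungapped columns, 8 of them identical.
print(round(percent_identity("ACDEFG-HIK", "ACDQFGLHIK"), 1))  # 88.9
```
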
The purified polypeptides can also include one or more of the following: a
heterologous DNA binding domain, a nuclear localization signal, a small molecular binding
domain (e.g., a steroid binding domain), an epitope tag or purification handle, a catalytic
domain (e.g., a nucleic acid modifying domain, a nucleic acid cleavage domain, or a DNA
repair catalytic domain) and/or a transcriptional function domain (e.g., an activation domain,
a repression domain, and so forth). In one embodiment, the polypeptide further includes a
second zinc finger domain, e.g., a domain having a sequence described herein. For example,
the polypeptide can include an array of zinc fingers that include two or more zinc finger
domains. In one embodiment, one or more of the domains (e.g., each domain) can have a
sequence that conforms to a motif described herein, e.g., mCSNR, mDSAR, mDSCR,
mISNR, mQFNR, mQSHV, mQSNI, mQSNK, mQSNR, mQSNV, mQSSR, mQTHQ,
mQTHR, mRDER, mRDHT, mRDKR, mRSHR, mRSNR, mVSNV, mVSSR, mVSTR,
mWSNR, mDGNV, mDSNR, and mRDNQ. Further, each domain can have a sequence
provided herein. As described below, the small letter "m" prefix indicates that the listed four
amino acids represent a motif of DNA contacting residues.
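The "m" prefix notation can be read mechanically, as in the following Python sketch. It assumes that a multi-finger entry is written as hyphen-joined motifs (as in the Motifs column of the tables) and that each motif is "m" followed by four one-letter amino acid codes; the function name is hypothetical. Because the motif list in the text is introduced with "e.g.", a motif absent from it is only flagged, not rejected.

```python
# Motifs named in the paragraph above, without the "m" prefix.
LISTED_MOTIFS = {
    "CSNR", "DSAR", "DSCR", "ISNR", "QFNR", "QSHV", "QSNI", "QSNK", "QSNR",
    "QSNV", "QSSR", "QTHQ", "QTHR", "RDER", "RDHT", "RDKR", "RSHR", "RSNR",
    "VSNV", "VSSR", "VSTR", "WSNR", "DGNV", "DSNR", "RDNQ",
}


def parse_motifs(entry: str) -> list[tuple[str, bool]]:
    """Split an entry such as 'mQSHR-mRDHT-mRSNR' into contact-residue motifs.

    Returns (four_residues, is_in_listed_examples) for each finger.
    """
    result = []
    for part in entry.split("-"):
        if len(part) != 5 or not part.startswith("m") or not part[1:].isupper():
            raise ValueError(f"not a motif of the form mXXXX: {part!r}")
        residues = part[1:]
        result.append((residues, residues in LISTED_MOTIFS))
    return result


print(parse_motifs("mQSHR-mRDHT-mRSNR"))
# [('QSHR', False), ('RDHT', True), ('RSNR', True)]
```
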

Nucleic acids of the invention include nucleic acids encoding the aforementioned
polypeptides. A nucleic acid of the invention can be operably regulated by a heterologous
nucleic acid sequence, e.g., an inducible promoter (e.g., a steroid hormone regulated
promoter, a small-molecule regulated promoter, or an engineered inducible system such as
the tetracycline Tet-On and Tet-Off systems). In one embodiment, the promoter is inducible
in a mammalian cell.

As described herein, the polypeptide can be produced in a cell and can regulate a gene
in the cell, e.g., an endogenous gene, by binding to a target site, e.g., a site that includes a
subsite that the respective zinc finger domain(s) recognizes. The cell can be a mammalian cell.

The invention further includes a method of expressing a polypeptide described herein, 
fused to a heterologous nucleic acid binding domain. The method includes introducing into a 
cell a nucleic acid encoding the aforementioned fusion protein. 

In another aspect, the invention features an encapsulated composition. The 
composition includes an encapsulation layer composed of a biocompatible material and 
recombinant mammalian cells. The cells contain a nucleic acid including a sequence 
encoding a chimeric zinc finger protein that regulates production of another nucleic acid in 
the cells, e.g., a heterologous nucleic acid or an endogenous nucleic acid. For example, the 
cells can regulate a gene that encodes a secreted polypeptide or that regulates or participates 
in the production of a secreted factor, e.g., a secreted polypeptide. In one embodiment, the 
secreted polypeptide is insulin, an insulin-like growth factor, VEGF-A, HGF, interferon, 
interleukin, or a fibroblast growth factor. 

The encapsulation layer typically is permeable at least to proteins having a molecular 
weight of 10 kDa, e.g., proteins about 10, 20, 30, 40, 50, or 70 kDa in molecular weight. The 
encapsulation layer can be impermeable, e.g., to proteins larger than those molecular weights, 
e.g., larger than 100 kDa. Additional encapsulation layers may be present. The chimeric 
zinc finger protein can include one or more features described herein. 

The term "zinc finger protein" refers to any protein that includes a zinc finger domain. 
A protein can include one or more polypeptide chains. Exemplary zinc finger proteins 
include two, three, four, five, six, or more zinc finger domains. Typically the protein is a 
single chain. However, in some embodiments, the protein can include a plurality of 
polypeptide chains. For example, the protein can be a heterodimeric or homodimeric protein. 

The term "base contacting positions," "DNA contacting positions," or "nucleic acid 
contacting positions" refers to the four amino acid positions of zinc finger domains that 
structurally correspond to the positions of amino acids arginine 73, aspartic acid 75, glutamic 
acid 76, and arginine 79 of ZIF268. 

Glu Arg Pro Tyr Ala Cys Pro Val Glu Ser Cys Asp Arg Arg Phe Ser
1               5                   10                  15
Arg Ser Asp Glu Leu Thr Arg His Ile Arg Ile His Thr Gly Gln Lys
            20                  25                  30
Pro Phe Gln Cys Arg Ile Cys Met Arg Asn Phe Ser Arg Ser Asp His
        35                  40                  45
Leu Thr Thr His Ile Arg Thr His Thr Gly Glu Lys Pro Phe Ala Cys
    50                  55                  60
Asp Ile Cys Gly Arg Lys Phe Ala Arg Ser Asp Glu Arg Lys Arg His
65                  70                  75                  80
Thr Lys Ile His Leu Arg Gln Lys Asp (SEQ ID NO:129)
                85

These positions are also referred to as positions -1, 2, 3, and 6, respectively. To 
identify positions in a query sequence that correspond to the base contacting positions, the 
query sequence is aligned to the zinc finger domain of interest such that the cysteine and 
histidine residues of the query sequence align with the corresponding residues of that domain. The 
ClustalW WWW Service at the European Bioinformatics Institute (Thompson et al. (1994) 
Nucleic Acids Res. 22:4673-4680) provides one convenient method of aligning sequences. 

Conservative amino acid substitutions refer to the interchangeability of residues 
having similar side chains. For example, a group of amino acids having aliphatic side chains 
is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic- 
hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing 
side chains is asparagine and glutamine; a group of amino acids having aromatic side chains 
is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is 
lysine, arginine, and histidine; a group of amino acids having acidic side chains is aspartic 
acid and glutamic acid; and a group of amino acids having sulfur-containing side chains is 
cysteine and methionine. Depending on circumstances, amino acids within the same group 
may be interchangeable. Some additional conservative amino acid substitution groups are: 
valine-leucine-isoleucine; phenylalanine-tyrosine; lysine-arginine; alanine-valine; aspartic 
acid-glutamic acid; and asparagine-glutamine. 
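
By way of illustration only, the groupings listed above can be expressed as a simple lookup. The following Python sketch is not part of the disclosed methods; the names SIDE_CHAIN_GROUPS, ADDITIONAL_GROUPS, and is_conservative are hypothetical, and the sketch treats any two different residues that share a listed group as a candidate conservative substitution, which, as noted above, depends on circumstances.

```python
# Side-chain groups transcribed from the definition above (one-letter codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "acidic": set("DE"),
    "sulfur-containing": set("CM"),
}

# The additional substitution groups listed in the text.
ADDITIONAL_GROUPS = [
    set("VLI"), set("FY"), set("KR"), set("AV"), set("DE"), set("NQ"),
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if residues a and b differ and share at least one listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    groups = list(SIDE_CHAIN_GROUPS.values()) + ADDITIONAL_GROUPS
    return any(a in g and b in g for g in groups)
```

For example, under these assumptions is_conservative("K", "R") is true, whereas is_conservative("D", "K") is false.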

The term "heterologous polypeptide" refers either to a polypeptide with a non- 
naturally occurring sequence (e.g., a hybrid polypeptide) or a polypeptide with a sequence 
identical to a naturally occurring polypeptide but present in a milieu in which it does not 
naturally occur. For example, the fusion of two naturally occurring polypeptides that are not 
fused together in Nature results in a heterologous polypeptide in which one polypeptide is 
heterologous to the other. 

The term "hybrid" refers to a non-naturally occurring polypeptide that comprises 
amino acid sequences derived from either (i) at least two different naturally occurring 
sequences; (ii) at least one artificial sequence (i.e., a sequence that does not occur naturally) 
and at least one naturally occurring sequence; or (iii) at least two artificial sequences (same 
or different). Examples of artificial sequences include mutants of a naturally occurring 
sequence and de novo designed sequences. 

As used herein, the term "hybridizes under stringent conditions" refers to conditions 
for hybridization in 6X sodium chloride/sodium citrate (SSC) at 45°C, followed by two 
washes in 0.2X SSC, 0.1% SDS at 65°C. The invention also features nucleic acids that 
hybridize under stringent conditions to a nucleic acid described herein or to a nucleic acid 
encoding a polypeptide described herein. 

The term "binding preference" refers to the discriminative property of a polypeptide 
for selecting one nucleic acid binding site relative to another. For example, when the 
polypeptide is limiting in quantity relative to two different nucleic acid binding sites, a 
greater amount of the polypeptide will bind the preferred site relative to the other site in an in 
vivo or in vitro assay described herein. 

As used herein, the "dissociation constant" refers to the equilibrium dissociation 
constant of a protein (e.g., a zinc finger protein) for binding to a target site of interest. In the 
case of a zinc finger protein that recognizes a target site between 9 and 18 basepairs, the 
binding is evaluated in the context of a 28-basepair double-stranded DNA. The dissociation 
constant is determined by gel shift analysis using purified protein that is bound in 20 mM 
Tris pH 7.7, 120 mM NaCl, 5 mM MgCl2, 20 µM ZnSO4, 10% glycerol, 0.1% Nonidet P-40, 
5 mM DTT, and 0.10 mg/mL BSA (bovine serum albumin) at room temperature. Additional 
details are provided in Example 1 and Rebar and Pabo ((1994) Science 263:671-673). 
Dissociation constants can be less than 10⁻⁶, 10⁻⁷, 10⁻⁸, or 10⁻⁹ M. 

One polypeptide (the "competing polypeptide") can be said to "compete" with 
another (a polypeptide of interest) for a binding site if, in an in vitro assay using probe 
molecules with the target site, when the competing polypeptide is present at a concentration no 
more than 10-fold greater than the concentration of the polypeptide of interest, the number of 
probe molecules bound by the polypeptide of interest is reduced by at least 25%. These 
experiments are done at about 50-fold the Kd of the polypeptide of interest for the probe 
molecule. 

A given zinc finger domain is said to "bind specifically" to a given 3-base pair DNA 
site if a chimeric protein that includes fingers 1 and 2 of Zif268 and the given zinc finger 
domain has an affinity of at least 5 nM for a target site that includes both the given 3-base 
pair DNA site and the 5-bp sequence, 5'-GGGCG-3', that is recognized by fingers 1 and 2 of 
Zif268. The terms "recognize" and "specifically bind" are used interchangeably and refer to 
the discrimination for a binding site by a zinc finger domain in the above Zif268 fusion assay. 

As used herein, "degenerate oligonucleotides" refers to both (a) a population of 
different oligonucleotides that each encode a particular amino acid sequence, and (b) a single 
species of oligonucleotide that can anneal to more than one sequence, e.g., an oligonucleotide 
with an unnatural nucleotide such as inosine. 

An "isolated composition" refers to a composition that is removed from at least 90% 
of at least one component of a cellular sample or reaction mixture from which the isolated 
composition can be obtained. Compositions produced artificially or naturally can be 
"compositions of at least" a certain degree of purity if the species or population of species of 
interest is at least 5, 10, 25, 50, 75, 80, 90, 92, 95, 98, or 99% pure on a weight-weight basis. 
Any protein or nucleic acid composition described herein can be provided in an isolated form. 

The use of zinc finger domains is particularly advantageous. First, the zinc finger 
structure is capable of recognizing very diverse DNA sequences, but any particular zinc 
finger can have a high degree of specificity for a particular sequence. Second, the structure 
of naturally occurring zinc finger proteins is modular. For example, the zinc finger protein 
Zif268, also called "Egr-1," is composed of a tandem array of three zinc finger domains. 
Pavletich and Pabo describe the x-ray crystallographic structure of a fragment of the zinc 
finger protein Zif268. Pavletich and Pabo (1991) Science 252:809-817. In this structural 
model, the three Zif268 fingers are complexed with DNA. Each finger independently 
contacts 3-4 basepairs of the DNA recognition site. High affinity binding is achieved by the 
cooperative effect of having multiple zinc finger modules in the same polypeptide chain. 
The present invention avails itself of all the zinc finger domains present in the human 
genome, or any other genome. This diverse sampling of sequence space occupied by the zinc 
finger domain structural fold may have the additional advantages inherent in eons of natural 
selection. Moreover, by utilizing domains from the host species, a zinc finger protein 
engineered for a gene therapy application by the methods described herein has a reduced 
likelihood of being regarded as foreign by the host immune response. It is also possible to use 
non-naturally occurring zinc finger domains, e.g., variants of human or mammalian zinc finger 
domains or completely artificial zinc finger domains. 

The ability to select a DNA binding domain that recognizes a particular sequence 
permits the design of novel proteins that specifically regulate a target gene, such as an 
endogenous cellular gene. In many implementations, the proteins have therapeutic or 
industrial applications. Other applications are also possible. 

This disclosure also includes a number of examples that demonstrate, using 

treating cancer. The examples show that zinc finger proteins can function as powerful 
inhibitors of VEGF-A expression. Since VEGF-A contributes to angiogenesis in tumor 
tissues, zinc finger proteins that modulate (e.g., inhibit) VEGF-A can be used, e.g., to reduce 
angiogenesis in and near tumors. 

All patents, patent applications, and references cited herein are incorporated by 
reference in their entirety. The following patent applications: WO 01/60970 (Kim et al.); 
U.S. Published Applications 2002-0061512, 2003-165997, and 2003-194727, and U.S. Serial 
Nos. 10/669,861, 60/431,892 and 60/477,459 are expressly incorporated by reference in their 
entirety for all purposes. The details of one or more embodiments of the invention are set 
forth in the accompanying drawings and the description below. Other features, objects, and 
advantages of the invention will be apparent from the description and drawings, and from the 
claims. 

DESCRIPTION OF DRAWINGS 

FIG. 1A, 1B, and 1C list the nucleic acid sequence (SEQ ID NO:120) of an 
exemplary region of the human VEGF-A gene. The region includes the promoter. The 
sequence is from GENBANK® entry AF095785.1. The transcriptional initiation site is at 
about nucleotide 2363. The start codon is at about nucleotide 3401. 

FIG. 2A, 2B, 2C, 2D, 2E, and 2F list the nucleic acid sequence (SEQ ID NO:121) 
of an exemplary region of the human transforming protein (FGF4) gene. The region includes 
the promoter. The sequence is from GENBANK® entry J02986.1 and AP006345.2 (Homo 
sapiens genomic DNA, chromosome 11 clone:RP11-186D19, complete sequence). The 
transcriptional initiation site is at about nucleotide 3731. The start codon is at about 
nucleotide 3959. 

FIG. 3A, 3B, 3C, 3D, and 3E list the nucleic acid sequence (SEQ ID NO:122) of an 
exemplary region of the human hepatocyte growth factor (HGF) gene. The region includes 
the promoter. The sequence is from GENBANK® entry AC004960.1 for Homo sapiens 
PAC clone RP5-1098B1 from 7q11.23-q21. The transcriptional initiation site is at about 
nucleotide 4389. The start codon is at about nucleotide 4454. 

FIG. 4 is a schematic of the VEGF-A promoter. 

FIG. 5A provides schematics of exemplary nucleic acid constructs for expressing 
zinc finger proteins with KRAB domains. 

FIG. 5B provides a schematic of an exemplary luciferase reporter construct that 
contains the VEGF-A promoter. 



DETAILED DESCRIPTION 

Chimeric zinc finger proteins that include at least one zinc finger domain can be used 
to regulate the expression of genes within cells. The zinc finger proteins can include two or more 
naturally-occurring zinc finger domains. In one set of examples, chimeric zinc finger 
proteins are used to regulate the VEGF-A gene in a mammalian cell. 

Chimeric zinc finger proteins can be obtained by a variety of methods. 
In one embodiment, these proteins are designed to recognize a target DNA site. 
Useful target sites include sites in a regulatory region of the target gene or within 1 kb or 
500 bp of a regulatory region of a target gene. For example, the target site can be within 1 kb 
or 500 bp of the TATA box or transcriptional start site of a gene. One method for designing a 
zinc finger protein includes parsing target sites into 3 or 4 basepair sequences that can be 
recognized by an individual zinc finger domain. Then a nucleic acid is constructed which 
includes a sequence that encodes a protein that has consecutive zinc finger domains 
corresponding to the parsed elements. A plurality of different nucleic acids that encode 
candidate proteins is constructed and expressed in a host cell. The expression of the target 
gene is evaluated to identify one or more of the candidates that is able to regulate expression 
of the target gene. 
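
The parsing step of the design method described above can be sketched, for illustration only, as follows. This Python sketch assumes non-overlapping 3-basepair subsites and a pre-existing table that maps each subsite to domains known to recognize it; the function names, the example target string, and the labels zfd_1 to zfd_4 are hypothetical placeholders and do not correspond to any domain or site disclosed herein. The ordering of fingers relative to the DNA strand is not modeled.

```python
from itertools import product


def parse_subsites(site: str, width: int = 3) -> list[str]:
    """Split a target site into consecutive, non-overlapping subsites."""
    site = site.upper()
    if len(site) % width != 0:
        raise ValueError("target site length must be a multiple of the subsite width")
    return [site[i:i + width] for i in range(0, len(site), width)]


def candidate_arrays(site: str, domains_by_subsite: dict[str, list[str]]) -> list[tuple[str, ...]]:
    """Enumerate every combination of one known domain per parsed subsite.

    Returns an empty list if any subsite has no known domain.
    """
    pools = [domains_by_subsite.get(s, []) for s in parse_subsites(site)]
    return list(product(*pools))


# Hypothetical lookup table; labels and subsites are placeholders only.
EXAMPLE_LOOKUP = {
    "GAG": ["zfd_1"],
    "GCT": ["zfd_2", "zfd_3"],
    "TGG": ["zfd_4"],
}
```

Each tuple returned by candidate_arrays corresponds to one candidate protein whose encoding nucleic acid would then be constructed and evaluated in a host cell as described above.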

In another embodiment, a chimeric zinc finger protein is selected from a library of 
zinc finger domains based on its phenotypic effect in a cell. For example, a nucleic acid 
library that encodes random chimeras of zinc finger domains is transformed into mammalian 
culture cells. Nucleic acids of the library are expressed in the cells. The cells are evaluated 
for a phenotype of interest, and cells in which the phenotype is altered relative to a control 
are isolated. The library nucleic acids in such cells are recovered, and the zinc finger protein 
encoded by such recovered nucleic acids can be further characterized, utilized, or modified. 

Zinc Finger Domains 

Zinc finger domains are small polypeptide domains of approximately 30 amino acid 
residues in which there are four residues, either cysteine or histidine, appropriately spaced 
such that they can coordinate a zinc ion (for reviews, see, e.g., Klug and Rhodes, (1987) 
Trends Biochem. Sci. 12:464-469; Evans and Hollenberg, (1988) Cell 52:1-3; Payre and 
Vincent, (1988) FEBS Lett. 234:245-250; Miller et al., (1985) EMBO J. 4:1609-1614; Berg, 
(1988) Proc. Natl. Acad. Sci. U.S.A. 85:99-102; Rosenfeld and Margalit, (1993) J. Biomol. 
Struct. Dyn. 11:557-570). Hence, zinc finger domains can be categorized according to the 
identity of the residues that coordinate the zinc ion, e.g., as the Cys2-His2 class, the Cys2-Cys2 
class, the Cys2-CysHis class, and so forth. The zinc coordinating residues of Cys2-His2 zinc 
fingers are typically spaced as follows: 

C-X_{2-5}-C-X_3-X_a-X_5-ψ-X_2-H-X_{3-5}-H, (SEQ ID NO:123) 

where ψ (psi) is a hydrophobic residue (Wolfe et al., (1999) Annu. Rev. Biophys. 
Biomol. Struct. 3:183-212), "X" represents any amino acid, the subscript number indicates 
the number of amino acids, and a subscript with two hyphenated numbers indicates a typical 
range of intervening amino acids. In many zinc finger domains, the initial cysteine is 
preceded by phenylalanine or tyrosine and then a non-cysteine amino acid. Typically, the 
intervening amino acids fold to form an anti-parallel β-sheet that packs against an α-helix, 
although the anti-parallel β-sheets can be short, non-ideal, or non-existent. The fold 
positions the zinc-coordinating side chains so they are in a tetrahedral conformation 
appropriate for coordinating the zinc ion. The base contacting residues are in the loop region 
between the pair of metal chelating residues. 

For convenience, the primary DNA contacting residues of a zinc finger domain are 
numbered: -1, 2, 3, and 6 based on the following example: 

                     -1 1 2 3 4   5 6
C-X_{2-5}-C-X_3-X_a-X-R-X-D-E-X_b-X-R-H-X_{3-5}-H (SEQ ID NO:124), 

As noted in the example above, the DNA contacting residues are Arg (R), Asp (D), 
Glu (E), and Arg (R). The above motif can be abbreviated RDER. As used herein, such 
abbreviation is a shorthand that refers to a particular polypeptide sequence from the second 
residue preceding the first cysteine (above, initial residue of SEQ ID NO:124) to the ultimate 
metal-chelating histidine (ultimate residue of SEQ ID NO:124). In the above motif and 
others, X_a is frequently aromatic, and X_b is frequently hydrophobic. Where two different 
sequences have the same motif, a number may be used to indicate each sequence (e.g., 
RDER1 or RDER2). 

In certain contexts where made explicitly apparent, the four-letter abbreviation refers 
to the motif in general. In other words, the motif specifies the amino acids at positions -1, 2, 
3, and 6, while the other positions can be any amino acid, typically, but not necessarily, a 
non-cysteine amino acid. The small letter "m" before a motif can be used to make explicit 
that the abbreviation is referring to a motif. For example, mRDER refers to a motif in which 
R appears at position -1, D at position 2, E at position 3, and R at position 6. 
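
The numbering convention above can be illustrated with a short Python sketch that derives the four-letter motif from the sequence of a single Cys2-His2 domain. The sketch is an illustration under stated assumptions, not a disclosed method: it assumes the typical spacing given above, in which exactly twelve residues separate the second cysteine from the first zinc-coordinating histidine, so that position 6 immediately precedes that histidine. The names C2H2_PATTERN and contact_motif are hypothetical, and domains with atypical spacing would require a structural alignment instead.

```python
import re

# C-X(2-5)-C-X(12)-H-X(3-5)-H; the twelve residues between the second Cys
# and the first coordinating His are captured for position lookup.
C2H2_PATTERN = re.compile(r"C.{2,5}C(.{12})H.{3,5}H")


def contact_motif(finger: str) -> str:
    """Return the motif (e.g., 'mRDER') formed by positions -1, 2, 3, and 6."""
    match = C2H2_PATTERN.search(finger.upper())
    if match is None:
        raise ValueError("sequence does not show the typical Cys2-His2 spacing")
    span = match.group(1)
    # Within the captured span, positions -1, 2, 3, and 6 fall at
    # zero-based offsets 5, 7, 8, and 11, respectively.
    return "m" + "".join(span[i] for i in (5, 7, 8, 11))
```

Applied to the RDER1 sequence of Table 6 (YVCDVEGCTWKFARSDELNRHKKRH), the sketch yields mRDER, consistent with the name of that domain.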

A zinc finger DNA-binding protein may consist of a tandem array of three or more 
zinc finger domains. 

The zinc finger domain (or "ZFD") is one of the most common eukaryotic DNA- 
binding motifs, found in species from yeast to higher plants and to humans. By one estimate, 
there are at least several thousand zinc finger domains in the human genome alone, possibly 
at least 4,500. Zinc finger domains can be identified in or isolated from zinc finger proteins. 
Non-limiting examples of zinc finger proteins include CF2-II; Kruppel; WT1; basonuclin; 
BCL-6/LAZ-3; erythroid Kruppel-like transcription factor; transcription factors Sp1, Sp2, 
Sp3, and Sp4; transcriptional repressor YY1; EGR1/Krox24; EGR2/Krox20; EGR3/Pilot; 
EGR4/AT133; Evi-1; GLI1; GLI2; GLI3; HIV-EP1/ZNF40; HIV-EP2; KR1; ZfX; ZfY; and 
ZNF7. 

An artificial transcription factor can include chimeras of available zinc finger domains. 
In one embodiment, one or more of the zinc finger domains is naturally occurring. Many 
exemplary human zinc finger domains are described in US 2002-0061512, US 2003-165997, 
and U.S.S.N. 60/431,892. See also Table 6 below. The binding specificities of each domain 
can be used to design a transcription factor with a particular specificity. 

Table 6: Exemplary Zinc Finger Domains 

ZFD       Amino Acid Sequence          SEQ ID NO: 
CSNR1     YKCKQCGKAFGCPSNLRRHGRTH      1 
DSAR2     YSCGICGKSFSDSSAKRRHCILH      2 
DSCR      YTCSDCGKAFRDKSCLNRHRRTH      3 
QSHR2     YKCGQCGKFYSQVSHLTRHQKIH      4 
QSHT      YKCEECGKAFRQSSHLTTHKIIH      5 
QSNR3     YECEKCGKAFNQSSNLTRHKKSH      6 
QSNV2     YVCSKCGKAFTQSSNLTVHQKIH      7 
QSSR1     YKCPDCGKSFSQSSSLIRHQRTH      8 
RDER1     YVCDVEGCTWKFARSDELNRHKKRH    9 
RDHT      FQCKTCQRKFSRSDHLKTHTRTH      10 
RSHR      YKCMECGKAFNRRSHLTRHQRIH      11 
RSNR      YICRKCGRGFSRKSNLIRHQRTH      12 
VSNV      YECDHCGKAFSVSSNLNVHRRIH      13 
VSSR      YTCKQCGKAFSVSSSLRRHETTH      14 
VSTR      YECNYCGKTFSVSSTLIRHQRIH      15 
WSNR      YRCEECGKAFRWPSNLTRHKRIH      16 
QSHV      YECDHCGKSFSQSSHLNVHKRTH      17 
RDHR1     FLCQYCAQRFGRKDHLTRHMKKS      18 
DSNRa#    YRCKYCDRSFSDSSNLQRHVRNIH     19 

# indicates that the domain is not a naturally occurring human domain. 

Additional exemplary zinc finger domains include domains with the following motifs: 
mCSNR, mDSAR, mDSCR, mISNR, mQFNR, mQSHV, mQSNI, mQSNK, mQSNR, 
mQSNV, mQSSR, mQTHQ, mQTHR, mRDER, mRDHT, mRDKR, mRSHR, mRSNR, 
mVSNV, mVSSR, mVSTR, mWSNR, mDGNV, mDSNR, and mRDNQ. 

It is also possible to use other types of DNA binding domains, e.g., at least one 
domain other than a zinc finger domain. The invention utilizes collections of nucleic acid 
binding domains with differing binding specificities. A variety of protein structures are 
known to interact with nucleic acids with high affinity and high specificity. For reviews of 
structural motifs which recognize double stranded DNA, see, e.g., Pabo and Sauer (1992) 
Annu. Rev. Biochem. 61:1053-95; Patikoglou and Burley (1997) Annu. Rev. Biophys. Biomol. 
Struct. 26:289-325; Nelson (1995) Curr Opin Genet Dev. 5:180-9. A few non-limiting 
examples of nucleic acid binding domains, other than zinc finger domains, include: 
homeodomains, helix-turn-helix domains, winged helix domains, and helix-loop-helix 
domains. 

Transcription Factor Features 

In addition to a DNA-binding domain, a transcription factor may optionally include a 
regulatory domain, a nuclear localization signal, or other feature described herein. 

Activation domains. Transcriptional activation domains that may be used in the 
present invention include but are not limited to the Gal4 activation domain from yeast and the 
VP16 domain from herpes simplex virus. The ability of a domain to activate transcription 
can be validated by fusing the domain to a known DNA binding domain and then 
determining if a reporter gene operably linked to sites recognized by the known DNA- 
binding domain is activated by the fusion protein. 

An exemplary activation domain is the following domain from p65: 

YLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPY
PFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVP
VLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE
FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTAQRPPDPAPAPLGAPGLPNGLLSGDEDFSS
IADMDFSALLSQ (SEQ ID NO:73) 

The sequence of an exemplary Gal4 activation domain is as follows: 

NFNQSGNIADSSLSFTFTNSSNGPNLITTQTNSQALSQPIASSNVHDNFMNNEITASKIDDGNNSKPL
SPGWTDQTAYNAFGITTGMFNTTTMDDVYNYLFDDEDTPPNPKKEISMAYPYDVPDYAS (SEQ ID 
NO:74) 

In bacteria, activation domain function can be emulated by a domain that recruits a 
wild-type RNA polymerase alpha subunit C-terminal domain or a mutant alpha subunit C- 
terminal domain, e.g., a C-terminal domain fused to a protein interaction domain. 

Repression domains. If desired, a repression domain instead of an activation domain 
can be fused to the DNA binding domain. Examples of eukaryotic repression domains 
include repression domains from Kid, UME6, ORANGE, groucho, and WRPW (see, e.g., 
Dawson et al., (1995) Mol. Cell Biol. 15:6923-31). The ability of a domain to repress 
transcription can be validated by fusing the domain to a known DNA binding domain and 
then determining if a reporter gene operably linked to sites recognized by the known DNA- 
binding domain is repressed by the fusion protein. 

A first exemplary repression domain is the "KRAB" domain from the Kid protein 
(Witzgall R. et al. (1994) Proc. Natl Acad. Sci. U.S.A., 91(10): 4514-8): 

VSVTFEDVAVLFTRDEWKKLDLSQRSLYREVMLENYSNLASMAGFLFTKPKVISLLQQG 
EDPW (SEQ ID NO: 75) 

A second exemplary repression domain is the KOX repression domain. This domain 
includes the "KRAB" domain from the human Kox1 protein (Zinc finger protein 10; NCBI 
protein database AAH24182; GI:18848329), i.e., amino acids 2-97 of Kox1: 

LRLEKGEEPWLVEREIHQETHPDSETAFEIKSSV^SEQ ID NO : 72 ^ VSLCYQLTKPBVI 
A third exemplary repression domain is the following domain from UME6 protein: 

NSASSSTKLDDDLGTAAAVTiSNMRSSPYRTHD^ 

GVLRPILLRIHNSEQQPIFESNNSTACI (SEQ ID NO:119) 

The WRPW domain is still another example of a repression domain. 

Still other chimeric transcription factors include neither an activation nor a repression 
domain. Rather, such transcription factors may alter transcription by displacing or otherwise 
competing with a bound endogenous transcription factor (e.g., an activator or repressor). 

Other Functional Domains. Examples of other functional domains include a histone 
modifying enzyme (e.g., a histone acetylase or deacetylase), a DNA modifying enzyme (e.g., 
a methylase), and so forth. 

A protein transduction domain can be fused to the zinc finger protein. Protein 
transduction domains result in uptake of the transduction domain and attached polypeptide 
into cells. A "protein transduction domain" or "PTD" is an amino acid sequence that can 
cross a biological membrane, particularly a cell membrane. When attached to a heterologous 
polypeptide, a PTD can enhance the translocation of the heterologous polypeptide across a 
biological membrane. The PTD is typically covalently attached (e.g., by a peptide bond) to 
the heterologous DNA binding domain. For example, the PTD and the heterologous DNA 
binding domain can be encoded by a single nucleic acid, e.g., in a common open reading 
frame or in one or more exons of a common gene. An exemplary PTD can include between 
10 and 30 amino acids and may form an amphipathic helix. Many PTD's are basic in character, 
e.g., include at least 4, 5, 6 or 8 basic residues (e.g., arginine or lysine). A PTD may be able 
to enhance the translocation of a polypeptide into a cell that lacks a cell wall or a cell from a 
particular species, e.g., a eukaryotic cell, e.g., a vertebrate cell, e.g., a mammalian cell, such 
as a human, simian, murine, bovine, equine, feline, or ovine cell. 

Typically a PTD is linked to a zinc finger protein by producing the DNA binding 
domain of the zinc finger protein and the PTD as a single polypeptide chain, but other 
methods for physically associating a PTD can be used. For example, the PTD can be 
associated by a non-covalent interaction (e.g., using biotin-avidin, coiled-coils, etc.). More 
typically, a PTD can be linked to a zinc finger protein, for example, using a flexible linker. 
Flexible linkers can include one or more glycine residues to allow for free rotation. For 
example, the PTD can be spaced from a DNA binding domain of the transcription factor by 
at least 10, 20, or 50 amino acids. A PTD can be located N- or C-terminal relative to a DNA 
binding domain. 

A zinc finger protein can also include a plurality of PTD's, e.g., a plurality of 
different PTD's or at least two copies of one PTD. 

Exemplary PTD's include the following segments from the antennapedia protein, the 
herpes simplex virus VP22 protein and HIV TAT protein. 

Tat. The Tat protein from Human Immunodeficiency virus type I (HIV-1) has the 
remarkable capacity to enter cells when added exogenously (Frankel A.D. and Pabo C.O. 
(1988) Cell 55:1189-1193, Mann D.A. and Frankel A.D. (1991) EMBO J. 10:1733-1739, 
Fawell et al. (1994) Proc. Natl. Acad. Sci. USA 91:664-668). The minimal Tat PTD includes 
residues 47-57 of the human immunodeficiency virus Tat protein. This peptide sequence is 
referred to as "TAT" herein. 

Antennapedia. The antennapedia homeodomain also includes a peptide that is a 
PTD. Derossi et al. (1994) J. Biol. Chem. 269:10444-10450. This peptide is also referred to as 
"Penetratin." 

VP22. The HSV VP22 protein also includes a PTD. This PTD is located at the VP22 
C-terminal 34 amino acid residues. See, e.g., Elliott and O'Hare (1997) Cell 88:223-234 and 
U.S. 6,184,038. 

Another exemplary PTD is a poly-arginine sequence, e.g., a sequence that includes at 
least 4, 5, 6 or 8 arginine residues, e.g., between 5 and 10 arginine residues. 

Cell-specific PTD's. Some PTD's are specific for particular cell types or states. 
One exemplary cell-specific PTD is the Hn1 synthetic peptide described in U.S. Published 
Application 2002-0102265. Hn1 is internalized by human head and neck squamous 
carcinoma cells and can be used to target an artificial transcription factor to a carcinoma, e.g., 
a carcinoma of the head or neck, or closely related sequences. U.S. Published Application 
2002-0102265 also describes a general method for using phage display to identify other 
peptides and proteins which can function as cell specific PTD's. For additional information 
about PTD's, see also U.S. 2003-0082561; U.S. 2002-0102265; U.S. 2003-0040038; 
Schwarze et al. (1999) Science 285:1569-1572; Derossi et al. (1996) J. Biol. Chem. 
271:18188; Hancock et al. (1991) EMBO J. 10:4033-4039; Buss et al. (1988) Mol. Cell. Biol. 
8:3960-3963; Derossi et al. (1998) Trends in Cell Biology 8:84-87; Lindgren et al. (2000) 
Trends in Pharmacological Sciences 21:99-103; Kilic et al. (2003) Stroke 34:1304-10; Asoh 
et al. (2002) Proc Natl Acad Sci USA 99(26):17107-12; and Tanaka et al. (2003) J Immunol. 
170(3):1291-8. 

Design of Novel DNA-Binding Proteins 

In one embodiment, a zinc finger protein is rationally designed by mixing and 
matching characterized zinc finger domains so that each domain recognizes one segment of 
the target site. Zinc finger domains can be isolated and characterized, e.g., using the methods 
described in US 2002-0061512 and 2003-165997. The modular structure of zinc finger 
domains facilitates their rearrangement to construct new DNA-binding proteins. Zinc finger 
domains in the naturally-occurring Zif268 protein are positioned in tandem along the DNA 
double helix. Each domain independently recognizes a different 3-4 basepair DNA segment. 

A Database of Zinc Finger Domains. The one-hybrid selection system described 
above can be utilized to identify one or more zinc finger domains for each possible 3- or 
4-basepair binding site or a representative number of such binding sites. The results of this 
process can be accumulated as a series of associations between a zinc finger domain and its 
preferred 3- or 4-basepair binding site or sites. Examples of such associations are provided 
in US 2002-0061512 and 2003-165997. 

The results can also be stored in a machine as a database, e.g., a relational database,
spreadsheet, or text file. Each record of such a database associates a representation of a zinc
finger domain with a string indicating the sequence of the one or more preferred binding sites
of the domain. The database record can include an indication of the relative affinity of the
zinc finger domains that bind each site. In some implementations, the database record can
also include information that indicates the physical location of the nucleic acid encoding the
particular zinc finger domain. Such a physical location can be, for example, a particular well
of a microtitre plate stored in a freezer.
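
By way of a non-limiting illustration, a record of the kind described above could be held in a small relational table. The following is a minimal sketch only; the table name, column names, domain labels, affinity values, and well addresses are hypothetical placeholders and are not taken from the cited applications.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table associating domains with preferred sites.

    Every name and value below is a hypothetical placeholder.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE zf_domain_site (
               domain_name  TEXT NOT NULL,  -- label for the zinc finger domain
               binding_site TEXT NOT NULL,  -- preferred 3- or 4-basepair site
               rel_affinity REAL,           -- relative affinity, arbitrary units
               plate_well   TEXT            -- location of the encoding nucleic acid
           )"""
    )
    rows = [
        ("ZF_A", "GAC", 1.0, "plate1:A01"),
        ("ZF_B", "GAC", 0.4, "plate1:A02"),
        ("ZF_C", "GCTA", 0.8, "plate1:B07"),
    ]
    conn.executemany("INSERT INTO zf_domain_site VALUES (?, ?, ?, ?)", rows)
    return conn


def domains_for_site(conn, site):
    """Return (domain, affinity, location) rows for a site, best affinity first."""
    cur = conn.execute(
        "SELECT domain_name, rel_affinity, plate_well FROM zf_domain_site "
        "WHERE binding_site = ? ORDER BY rel_affinity DESC",
        (site.upper(),),
    )
    return cur.fetchall()
```

A query such as domains_for_site(conn, "GAC") would then list the candidate domains for that subsite together with the well in which the encoding nucleic acid is stored.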

The database can be configured so that it can be queried or filtered, e.g., using a SQL
operating environment, a scripting language (such as PERL or a MICROSOFT EXCEL®
macro), or a programming language. Such a database would enable a user to identify one or
more zinc finger domains that recognize a particular 3- or 4-basepair binding site. The database
and other information, such as can be stored on a database server, can also be configured to
communicate with each device using commands and other signals that are interpretable by
the device. The computer-based aspects of the system can be implemented in digital
electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof.
An apparatus of the invention, e.g., the database server, can be implemented in a computer
program product tangibly embodied in a machine-readable storage device for execution by a
programmable processor; and method actions can be performed by a programmable
processor executing a program of instructions to perform functions of the invention by
operating on input data and generating output. One non-limiting example of an execution
environment includes computers running WINDOWS XP® or WINDOWS NT 4.0®
(Microsoft, Redmond WA), LINUX™, or other operating systems.

The zinc finger domains can also be tested in the context of multiple different fusion
proteins to verify their specificity. Moreover, particular binding sites for which a paucity of
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domains is available can be the target of additional selection screens. Libraries for such 
selections can be prepared by mutagenizing a zinc finger domain that binds a similar yet 
distinct site. A complete matrix of zinc finger domains for each possible binding site is not 
essential, as the domains can be staggered relative to the target binding site in order to best 
utilize the domains available. Such staggering can be accomplished both by parsing the 
binding site in the most useful 3 or 4 basepair binding sites, and also by varying the linker 
length between zinc finger domains. In order to incorporate both selectivity and high affinity 
into the designed polypeptide, zinc finger domains that have high specificity for a desired site
can be flanked by other domains that bind with higher affinity, but lesser specificity. The in 
vivo screening methods described in US 2002-0061512 and 2003-165997 can be used to test 
the in vivo function, affinity, and specificity of an artificially assembled zinc finger protein 
and derivatives thereof. Likewise, these methods can be used to optimize such assembled 
proteins, e.g., by creating libraries of varied linker composition, varied zinc finger domain 
modules, varied zinc finger domain compositions, and so forth.

Parsing a target site. The target 9-bp or longer DNA sequence is parsed into 3- or 4-bp
segments. Zinc finger domains are identified (e.g., from a database described above) that
recognize each parsed 3- or 4-bp segment. Longer target sequences, e.g., 20 bp to 500 bp
sequences, are also suitable targets, as 9 bp, 12 bp, and 15 bp subsequences can be identified
within them. In particular, subsequences amenable for parsing into sites well represented in the
database can serve as initial design targets.
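
The parsing step can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and it enumerates contiguous, non-overlapping splits, so staggering of domains and variation in linker length are not modeled.

```python
def parse_target(seq, sizes=(3, 4)):
    """Enumerate every way to split seq into consecutive 3- or 4-bp segments."""
    seq = seq.upper()
    if not seq:
        return [[]]  # one way to parse nothing: the empty parse
    parses = []
    for size in sizes:
        if len(seq) >= size:
            head = seq[:size]
            # Recurse on the remainder and prepend the current segment.
            for rest in parse_target(seq[size:], sizes):
                parses.append([head] + rest)
    return parses


def covered_parses(seq, known_sites):
    """Keep only the parses in which every segment has a known domain."""
    return [p for p in parse_target(seq) if all(s in known_sites for s in p)]
```

Here known_sites stands for the set of binding sites represented in the database; a target for which covered_parses returns at least one parse is a candidate initial design target.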

A scoring regime can be used to estimate the probability that a particular chimeric zinc 
finger protein would recognize the target site in the cell. The scores can be a function of each 
component finger's affinity for its preferred subsites, its specificity, and its success in previously 
designed proteins. 
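
One possible form of such a scoring regime is sketched below. The multiplicative form, the 0-to-1 scales, and the weights are assumptions made for illustration; the resulting number is a ranking aid and not a calibrated probability.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Finger:
    affinity: float       # affinity for the preferred subsite, assumed scaled to 0..1
    specificity: float    # specificity, assumed scaled to 0..1
    prior_success: float  # fraction of earlier designs in which the finger worked


def finger_score(f, weights=(1.0, 1.0, 1.0)):
    """Combine the three properties of one finger (assumed multiplicative form)."""
    wa, ws, wp = weights
    return (f.affinity ** wa) * (f.specificity ** ws) * (f.prior_success ** wp)


def design_score(fingers, weights=(1.0, 1.0, 1.0)):
    """Score a chimeric protein as the product of its component finger scores."""
    return prod(finger_score(f, weights) for f in fingers)
```

Candidate designs for the same target site can then be sorted by design_score, and the highest-ranking designs selected for construction and testing.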

Computer Programs. Computer systems and software can be used to access a 
machine-readable database described above, parse a target site, and output one or more 
chimeric zinc finger protein designs. 

The techniques may be implemented in programs executing on programmable 
machines such as mobile or stationary computers, and similar devices that each include a 
processor, a storage medium readable by the processor, and one or more output devices. 
Each program may be implemented in a high level procedural or object oriented 
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programming language to communicate with a machine system. Some merely illustrative 
examples of computer languages include C, C++, JAVA™, Fortran, and VISUAL BASIC™. 

Each such program may be stored on a storage medium or device, e.g., compact disc 
read only memory (CD-ROM), hard disk, magnetic diskette, or similar medium or device, 
that is readable by a general or special purpose programmable machine for configuring and
operating the machine when the storage medium or device is read by the computer to perform 
the procedures described in this document. The system may also be implemented as a 
machine-readable storage medium, configured with a program, where the storage medium so 
configured causes a machine to operate in a specific and predefined manner. 

The computer system can be connected to an internal or external network. For
example, the computer system can receive requests from a remotely located client system,
e.g., using HTTP, HTTPS, or XML protocols. The requests can be an identifier for a known
target gene or a string representing the sequence of a target nucleic acid. In the former case,
the computer system can access a sequence database such as GENBANK® to retrieve the
nucleic acid sequence of regulatory regions of the target gene. The sequence of the
regulatory region or the directly-received target nucleic acid sequence is then parsed into
subsites, and chimeric zinc finger proteins are designed, e.g., as described above.
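
The two kinds of request can be distinguished, for example, as sketched below. The rule used (a request consisting only of the letters A, C, G, T, or N is treated as a sequence) is an assumption made for illustration, and a gene identifier spelled entirely with those letters would be misclassified; the function name is hypothetical.

```python
import re

# Assumed rule: only nucleotide letters (and N) means the request is a sequence.
_DNA = re.compile(r"[ACGTN]+", re.IGNORECASE)


def classify_request(text):
    """Return ("sequence", SEQ) or ("identifier", ID) for a client request."""
    token = "".join(text.split())  # drop spaces and line breaks
    if not token:
        raise ValueError("empty request")
    if _DNA.fullmatch(token):
        return ("sequence", token.upper())
    return ("identifier", token)
```

A request classified as an identifier would be resolved against a sequence database before parsing, whereas a request classified as a sequence would be parsed directly.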

The system can communicate the results to the remotely located client. Alternatively,
the system can control a robot to physically retrieve nucleic acid encoding the chimeric zinc
finger proteins. In this implementation, a library of nucleic acids encoding chimeric zinc
finger proteins is constructed and stored, e.g., as frozen purified DNA or frozen bacterial
strains harboring the nucleic acids. The robot responds to signals from the computer system
by accessing specified addresses of the library. The retrieved nucleic acids can then be
processed, packaged and delivered to the client. Alternatively, the retrieved nucleic acids can
be introduced into cells and assayed. The computer system can then communicate the results
of the assay to the client across the network.

Constructing a Protein from Selected Modules. Once a chimeric polypeptide
sequence containing multiple zinc finger domains is designed, a nucleic acid sequence
encoding the designed polypeptide sequence can be synthesized. Methods for constructing
synthetic genes are routine in the art. Such methods include gene construction from custom
synthesized oligonucleotides, PCR mediated cloning, and mega-primer PCR. In one example,
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nucleic acids encoding selected zinc finger domains are serially ligated to form a nucleic acid 
encoding a chimeric polypeptide. Additional sequences can be joined to the nucleic acid 
encoding the designed polypeptide sequence. The additional sequence can itself provide 
regulatory functions or can encode an amino acid sequence with a desired function. 

Profiling Regulatory Properties of a Chimeric Zinc Finger Protein

A chimeric zinc finger protein can be characterized to determine its ability to regulate
one or more endogenous genes in a cell, e.g., a mammalian cell. Nucleic acid encoding the
chimeric zinc finger protein is first fused to a repression or activation domain, and then
introduced into a cell of interest. After appropriate incubation and induction of expression of
the coding nucleic acid, mRNA is harvested from the cell and analyzed using a nucleic acid
microarray.

Nucleic acid microarrays can be fabricated by a variety of methods, e.g.,
photolithographic methods (see, e.g., U.S. Patent No. 5,510,270), mechanical methods (e.g.,
directed-flow methods as described in U.S. Patent No. 5,384,261), and pin based methods
(e.g., as described in U.S. Pat. No. 5,288,514). The array is synthesized with a unique
capture probe at each address, each capture probe being appropriate to detect a nucleic acid
for a particular expressed gene.

The mRNA can be isolated by routine methods, e.g., including DNase treatment to
remove genomic DNA and hybridization to an oligo-dT coupled solid substrate (e.g., as
described in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y.). The
substrate is washed, and the mRNA is eluted. The isolated mRNA is then reverse
transcribed and optionally amplified, e.g., by rtPCR, e.g., as described in U.S. Patent No.
4,683,202. The nucleic acid can be labeled during amplification or reverse transcription,
e.g., by the incorporation of a labeled nucleotide. Examples of preferred labels include
fluorescent labels, e.g., red-fluorescent dye Cy5 (Amersham) or green-fluorescent dye Cy3
(Amersham). Alternatively, the nucleic acid can be labeled with biotin, and detected after
hybridization with labeled streptavidin, e.g., streptavidin-phycoerythrin (Molecular Probes).

The labeled nucleic acid is then contacted to the array. In addition, a control nucleic
acid or a reference nucleic acid can be contacted to the same array. The control nucleic acid
or reference nucleic acid can be labeled with a label other than the sample nucleic acid, e.g.,
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one with a different emission maximum. Labeled nucleic acids are contacted to an array 
under hybridization conditions. The array is washed, and then imaged to detect fluorescence 
at each address of the array. 

A general scheme for producing and evaluating profiles includes detecting
hybridization at each address of the array. The extent of hybridization at an address is
represented by a numerical value and stored, e.g., in a vector, a one-dimensional matrix, or
one-dimensional array. The vector x has a value for each address of the array. For example,
a numerical value for the extent of hybridization at a particular address a is stored in variable x_a.
The numerical value can be adjusted, e.g., for local background levels, sample amount, and
other variations. Nucleic acid is also prepared from a reference sample and hybridized to the
same or a different array. The vector y is constructed identically to vector x. The sample
expression profile and the reference profile can be compared, e.g., using a mathematical
equation that is a function of the two vectors. The comparison can be evaluated as a scalar
value, e.g., a score representing similarity of the two profiles. Either or both vectors can be
transformed by a matrix in order to add weighting values to different genes detected by the
array.
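
One way to reduce two such profiles to a scalar similarity score is sketched below. The choice of cosine similarity and the use of a diagonal weighting matrix (one weight per address) are assumptions made for illustration; other functions of the two vectors can equally be used.

```python
from math import sqrt


def apply_weights(vec, weights):
    """Transform a profile by a diagonal weighting matrix (one weight per address)."""
    return [w * v for w, v in zip(weights, vec)]


def cosine_similarity(x, y):
    """Cosine of the angle between two profiles; 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("profile has zero magnitude")
    return dot / (norm_x * norm_y)


def profile_similarity(x, y, weights=None):
    """Compare sample profile x with reference profile y, optionally weighted."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same addresses")
    if weights is not None:
        if len(weights) != len(x):
            raise ValueError("one weight is needed per address")
        x = apply_weights(x, weights)
        y = apply_weights(y, weights)
    return cosine_similarity(x, y)
```

Setting a weight to zero removes the corresponding address from the comparison, which is one simple way of restricting the score to genes of interest.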

The expression data can be stored in a database, e.g., a relational database such as a
SQL database (e.g., Oracle or Sybase database environments). The database can have
multiple tables. For example, raw expression data can be stored in one table, wherein each
column corresponds to a gene being assayed, e.g., an address or an array, and each row
corresponds to a sample. A separate table can store identifiers and sample information, e.g.,
the batch number of the array used, date, and other quality control information.

Genes that are similarly regulated can be identified by clustering expression data to
identify coregulated genes. Such a cluster may be indicative of a set of genes coordinately
regulated by the chimeric zinc finger protein. Genes can be clustered using hierarchical
clustering (see, e.g., Sokal and Michener (1958) Univ. Kans. Sci. Bull. 38:1409), Bayesian
clustering, k-means clustering, and self-organizing maps (see, Tamayo et al. (1999) Proc.
Natl. Acad. Sci. USA 96:2907).

The similarity of a sample expression profile to a reference expression profile (e.g., from a
control cell) can also be determined, e.g., by comparing the log of the expression level of the
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sample to the log of the predictor or reference expression value and adjusting the comparison 
by the weighting factor for all genes of predictive value in the profile. 
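
A weighted log comparison of this kind can be sketched as follows. The specific formula (a weighted mean of absolute log2 differences) is an assumption chosen for illustration, and the function name is hypothetical; genes without predictive value can be given a weight of zero.

```python
from math import log2


def weighted_log_distance(sample, reference, weights):
    """Weighted mean absolute difference of log2 expression levels.

    0.0 means the weighted genes are expressed identically; 1.0 means an
    average two-fold difference across the weighted genes.
    """
    if not (len(sample) == len(reference) == len(weights)):
        raise ValueError("one value per gene is required in each list")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one gene must have a positive weight")
    diff = sum(
        w * abs(log2(s) - log2(r))
        for s, r, w in zip(sample, reference, weights)
    )
    return diff / total_weight
```

Expression levels are assumed to be positive, background-adjusted values; a zero or negative level would need to be floored before the logarithm is taken.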

Additional Features for Designed Transcription Factors 

Peptide Linkers. DNA binding domains can be connected by a variety of linkers.
The utility and design of linkers are well known in the art. A particularly useful linker is a
peptide linker that is encoded by nucleic acid. Thus, one can construct a synthetic gene that
encodes a first DNA binding domain, the peptide linker, and a second DNA binding domain.
This design can be repeated in order to construct large, synthetic, multi-domain DNA binding
proteins. PCT WO 99/45132 and Kim and Pabo ((1998) Proc. Natl. Acad. Sci. USA
95:2812-7) describe the design of peptide linkers suitable for joining zinc finger domains.

Additional peptide linkers are available that form random coil, α-helical or β-pleated
tertiary structures. Polypeptides that form suitable flexible linkers are well known in the art
(see, e.g., Robinson and Sauer (1998) Proc Natl Acad Sci USA. 95:5929-34). Flexible
linkers typically include glycine, because this amino acid, which lacks a side chain, is unique
in its rotational freedom. Serine or threonine can be interspersed in the linker to increase
hydrophilicity. In addition, amino acids capable of interacting with the phosphate
backbone of DNA can be utilized in order to increase binding affinity. Judicious use of such
amino acids allows for balancing increases in affinity with loss of sequence specificity. If a
rigid extension is desirable as a linker, α-helical linkers, such as the helical linker described
in Pantoliano et al. (1991) Biochem. 30:10117-10125, can be used. Linkers can also be
designed by computer modeling (see, e.g., U.S. Pat. No. 4,946,778). Software for molecular
modeling is commercially available (e.g., from Molecular Simulations, Inc., San Diego, CA).
The linker is optionally optimized, e.g., to reduce antigenicity and/or to increase stability,
using standard mutagenesis techniques and appropriate biophysical tests as practiced in the
art of protein engineering, and functional assays as described herein.

For implementations utilizing zinc finger domains, the peptide that occurs naturally
between zinc fingers can be used as a linker to join fingers together. A typical such naturally
occurring linker is: Thr-Gly-(Glu or Gln)-(Lys or Arg)-Pro-(Tyr or Phe) (SEQ ID NO:125).

Dimerization Domains. An alternative method of linking DNA binding domains is
the use of dimerization domains, especially heterodimerization domains (see, e.g., Pomerantz
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et al (1998) Biochemistry 37:965-970). In this implementation, DNA binding domains are 
present in separate polypeptide chains. For example, a first polypeptide includes DNA
binding domain A, linker, and domain B, while a second polypeptide includes domain C,
linker, and domain D. An artisan can select a dimerization domain from the many well-characterized
dimerization domains. Domains that favor heterodimerization can be used if
homodimers are not desired. A particularly adaptable dimerization domain is the coiled-coil
motif, e.g., a dimeric parallel or anti-parallel coiled-coil. Coiled-coil sequences that
preferentially form heterodimers are also available (Lumb and Kim, (1995) Biochemistry
34:8642-8648). Another species of dimerization domain is one in which dimerization is
triggered by a small molecule or by a signaling event. For example, a dimeric form of
FK506 can be used to dimerize two FK506 binding protein (FKBP) domains. Such
dimerization domains can be utilized to provide additional levels of regulation.

Functional Assays and Uses 



Zinc finger proteins can be evaluated using cell-free assays and cellular assays.
Examples of cell-free assays include assays in which at least partially purified protein is
evaluated for a biochemical property, e.g., DNA binding in vitro. Examples of useful in vitro
assays include electrophoretic mobility shift assays (EMSA), DNA footprinting, DNA
methylation protection assays, surface plasmon resonance, fluorescence polarization, and
fluorescence resonance energy transfer (FRET). Binding and other functional properties can be
assayed in cellular assays or in vivo (e.g., in an organism).

For example, domains can be selected to bind to a target site, e.g., to a promoter site of a
gene that modulates cell proliferation. By modular assembly, a protein can be designed that
includes (1) the selected domains that respectively bind to subsites spanning the target promoter
site, and (2) a transcriptional regulatory domain, e.g., an activation domain or a repression
domain. In an example in which the protein regulates a gene that modulates cell proliferation
and the protein is intended to counteract cell proliferation, the appropriate transcriptional
regulatory domain can be chosen depending on whether the gene increases cell proliferation
(e.g., a repression domain is selected) or decreases cell proliferation (e.g., an activation domain
is selected). In another example, a library encoding random combinations of zinc finger domains
is screened to identify a chimeric zinc finger protein that alters a phenotype.
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A nucleic acid sequence encoding a chimeric zinc finger protein can be cloned into an 
expression vector, e.g., an inducible expression vector as described in Kang and Kim, (2000) J
Biol Chem 275:8742. The inducible expression vector can include an inducible promoter or
regulatory sequence. Non-limiting examples of inducible promoters include steroid-hormone
responsive promoters (e.g., ecdysone-responsive, estrogen-responsive, and glucocorticoid-responsive
promoters), the tetracycline "Tet-On" and "Tet-Off" systems, and metal-responsive
promoters. The construct can be transfected into tissue culture cells or into embryonic stem
cells to generate a transgenic organism as a model subject. The efficacy of the chimeric zinc
finger protein can be determined by inducing expression of the protein and assaying cell
proliferation of the tissue culture cell or assaying for developmental changes and/or tumor
growth in a transgenic animal model. In addition, the level of expression of the gene being
targeted can be assayed by routine methods to detect mRNA, e.g., RT-PCR or Northern blots.
A more complete diagnostic includes purifying mRNA from cells expressing and not expressing
the chimeric zinc finger protein. The two pools of mRNA are used to probe a microarray
containing probes to a large collection of genes, e.g., a collection of genes relevant to the
condition of interest (e.g., cancer) or a collection of genes identified in the organism's genome.
Such an assay is particularly valuable for determining the specificity of the chimeric zinc finger
protein. If the protein binds with high affinity but little specificity, it may cause pleiotropic and
undesirable effects by affecting expression of genes in addition to the contemplated target. Such
effects are revealed by a global analysis of transcripts.

In addition, the chimeric zinc finger protein can be produced in a subject cell or subject
organism in order to regulate an endogenous gene. The chimeric zinc finger protein is
configured, as described above, to bind to a region of the endogenous gene and to provide a
transcriptional activation or repression function. As described in Kang and Kim (supra), the
expression of a nucleic acid encoding the chimeric zinc finger protein can be operably linked to
a regulatable promoter (e.g., an inducible or suppressible promoter). By modulating the
concentration of an agent that can regulate the promoter, e.g., an inducer for the promoter, the
expression of the endogenous gene can be regulated in a concentration-dependent manner.

The binding site preference of a zinc finger protein can be verified by a biochemical
assay such as EMSA, DNase footprinting, surface plasmon resonance, SELEX, or column
binding. The substrate for binding can be, e.g., a synthetic oligonucleotide encompassing the
target site or a restriction fragment. The assay can also include non-specific DNA as a
competitor, or specific DNA sequences as a competitor. Specific competitor DNAs can
include the recognition site for DNA binding with one, two, or three nucleotide mutations.
Thus, a biochemical assay can be used to measure not only the affinity of a domain for a
given site, but also its affinity to the site relative to other sites. Rebar and Pabo, (1994)
Science 263:671-673 describe a method of obtaining apparent Kd constants for zinc finger
domains from EMSA. Exemplary zinc finger proteins have at least 2, 5, 10, 50, 100, or 500
fold preference for a particular recognition site relative to a related site with one, two, or
three nucleotide mutations.
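
The fold preference can be computed from measured binding constants, for example as sketched below. The sketch assumes that the constants are apparent dissociation constants expressed in the same units (here nanomolar), so that a lower value indicates tighter binding; the function names and the numerical values in the usage note are hypothetical.

```python
# Fold-preference levels recited in the text above.
THRESHOLDS = (2, 5, 10, 50, 100, 500)


def fold_preference(kd_target_nm, kd_related_nm):
    """Ratio of dissociation constants: related (mutated) site over target site."""
    if kd_target_nm <= 0 or kd_related_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_related_nm / kd_target_nm


def highest_threshold_met(fold):
    """Largest recited threshold that the fold preference reaches, or None."""
    met = [t for t in THRESHOLDS if fold >= t]
    return max(met) if met else None
```

For instance, a protein with an apparent constant of 2 nM for its recognition site and 200 nM for a site bearing one mutation would have a 100-fold preference under this definition.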

A protein or nucleic acid described herein can also be evaluated, e.g., in vitro or in 
vivo for a biological activity, e.g., ability to modulate an endothelial cell or to modulate
angiogenesis. 

Endothelial cell proliferation. A protein or nucleic acid can be tested for
endothelial proliferation inhibiting activity using a biological activity assay such as the
bovine capillary endothelial cell proliferation assay, the chick CAM assay, the mouse corneal
assay, or by evaluating the effect of the protein or nucleic acid being tested on implanted
tumors. The chick CAM assay is described, e.g., by O'Reilly, et al. in "Angiogenic
Regulation of Metastatic Growth" Cell, vol. 79 (2), Oct. 21, 1994, pp. 315-328. Briefly,
three-day old chicken embryos with intact yolks are separated from the egg and placed in a
petri dish. After three days of incubation a methylcellulose disc containing the protein to be
tested is applied to the CAM of individual embryos. After 48 hours of incubation, the
embryos and CAMs are observed to determine whether endothelial growth has been inhibited.
The mouse corneal assay involves implanting a growth factor-containing pellet, along with
another pellet containing the suspected endothelial growth inhibitor, in the cornea of a mouse
and observing the pattern of capillaries that are elaborated in the cornea.

Angiogenesis. Angiogenesis may be assayed, e.g., using various human endothelial
cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include
Alamar Blue based assays (available from Biosource International) to measure proliferation;
migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon
HTS FluoroBlok cell culture inserts to measure migration of cells through membranes in the
presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays
based on the formation of tubular structures by endothelial cells on Matrigel™ (Becton
Dickinson).

Cell adhesion. Cell adhesion assays measure adhesion of cells to purified adhesion
proteins or adhesion of cells to each other, in the presence or absence of the protein or nucleic
acid being tested. Cell-protein adhesion assays measure the ability of agents to modulate the
adhesion of cells to purified proteins. For example, recombinant proteins are produced,
diluted to 2.5 µg/mL in PBS, and used to coat the wells of a microtiter plate. The wells used
for negative control are not coated. Coated wells are then washed, blocked with 1% BSA,
and washed again. Compounds are diluted to 2 times final test concentration and added to the
blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off.
Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent
dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.

Cell-cell adhesion assays can be used to measure the ability of the protein or nucleic
acid being tested to modulate binding of cells to each other. These assays can use cells that
naturally or recombinantly express an adhesion protein of choice. In an exemplary assay,
cells expressing the cell adhesion protein are plated in wells of a multiwell plate together
with other cells (either more of the same cell type, or another type of cell to which the cells
adhere). The cells that can adhere are labeled with a membrane-permeable fluorescent dye,
such as BCECF, and allowed to adhere to the monolayers in the presence of the protein or
nucleic acid being tested. Unbound cells are washed off, and bound cells are detected using a
fluorescence plate reader. High-throughput cell adhesion assays have also been described.
See, e.g., Falsey J R et al., Bioconjug Chem. May-June 2001;12(3):346-53.

Tubulogenesis. Tubulogenesis assays can be used to monitor the ability of cultured
cells, generally endothelial cells, to form tubular structures on a matrix substrate, which
generally simulates the environment of the extracellular matrix. Exemplary substrates
include Matrigel™ (Becton Dickinson), an extract of basement membrane proteins
containing laminin, collagen IV, and heparan sulfate proteoglycan, which is liquid at 4°C and
forms a solid gel at 37°C. Other suitable matrices comprise extracellular components such as
collagen, fibronectin, and/or fibrin. Cells are stimulated with a pro-angiogenic stimulant, and
their ability to form tubules is detected by imaging. Tubules can generally be detected after
an overnight incubation with stimuli, but longer or shorter time frames may also be used.
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Tube formation assays are well known in the art (e.g., Jones M K et al., 1999, Nature 
Medicine 5:1418-1423). These assays have traditionally involved stimulation with serum or 
with the growth factors FGF or VEGF. In one embodiment, the assay is performed with cells 
cultured in serum free medium. In one embodiment, the assay is performed in the presence 
of one or more pro-angiogenic agents, e.g., inflammatory angiogenic factors such as TNF-α,
or FGF, VEGF, phorbol myristate acetate (PMA), ephrin, etc.

Cell Migration. An exemplary assay for endothelial cell migration is the human
microvascular endothelial (HMVEC) migration assay. See, e.g., Tolsma et al. (1993) J. Cell
Biol 122:497-511. Migration assays are known in the art (e.g., Paik J H et al., 2001, J Biol
Chem 276:11830-11837). In one example, cultured endothelial cells are seeded onto a
matrix-coated porous lamina, with pore sizes generally smaller than typical cell size. The
lamina is typically a membrane, such as the transwell polycarbonate membrane (Corning
Costar Corporation, Cambridge, Mass.), and is generally part of an upper chamber that is in
fluid contact with a lower chamber containing pro-angiogenic stimuli. Migration is generally
assayed after an overnight incubation with stimuli, but longer or shorter time frames may also
be used. Migration is assessed as the number of cells that crossed the lamina, and may be
detected by staining cells with hematoxylin solution (VWR Scientific), or by any other
method for determining cell number. In another exemplary set up, cells are fluorescently
labeled and migration is detected using fluorescent readings, for instance using the Falcon
HTS FluoroBlok (Becton Dickinson). While some migration is observed in the absence of
stimulus, migration is greatly increased in response to pro-angiogenic factors. The assay can
be used to test the effect of the protein or nucleic acid being tested on endothelial cell
migration.

Sprouting assay. An exemplary sprouting assay is a three-dimensional in vitro
angiogenesis assay that uses a cell-number defined spheroid aggregation of endothelial cells
("spheroid"), embedded in a collagen gel-based matrix. The spheroid can serve as a starting
point for the sprouting of capillary-like structures by invasion into the extracellular matrix
(termed "cell sprouting") and the subsequent formation of complex anastomosing networks
(Korff and Augustin, 1999, J Cell Sci 112:3249-58). In an exemplary experimental set-up,
spheroids are prepared by pipetting 400 human umbilical vein endothelial cells into
individual wells of nonadhesive 96-well plates to allow overnight spheroidal aggregation
(Korff and Augustin: J Cell Biol 143:1341-52, 1998). Spheroids are harvested and seeded in
900 µl of methocel-collagen solution and pipetted into individual wells of a 24 well plate to
allow collagen gel polymerization. Test agents are added after 30 min by pipetting 100 µl of
10-fold concentrated working dilution of the test substances on top of the gel. Plates are
incubated at 37°C for 24 h. Dishes are fixed at the end of the experimental incubation period
by addition of paraformaldehyde. Sprouting intensity of endothelial cells can be quantitated
by an automated image analysis system to determine the cumulative sprout length per
spheroid.

Other exemplary assays include: Ferrara and Henzel (1989) Nature 380:439-443;
Gospodarowicz et al. (1989) Proc. Natl. Acad. Sci. USA, 86:7311-7315; Claffey et al.
(1995) Biochim. Biophys. Acta. 1246:1-9; Leung et al. (1989) Science 246:1306-1309;
Rastinejad et al. (1989) Cell 56:345-355; and US 5,840,693. The ability of a composition to
modulate ischemia can be evaluated, e.g., using a rat hindlimb ischemia model (see, e.g.,
Takeshita, S. et al., Circulation (1998) 98:1261-63).

Targets for Gene Regulation 

The target gene can be any gene, e.g., a chromosomal gene or a heterologous gene
(e.g., a transgene). The target gene can be selected, e.g., if it is useful to regulate (e.g.,
increase or decrease) activity of the target gene. For example, a gene required by a pathogen
can be repressed, a gene required for cancerous growth can be repressed, a gene poorly
expressed or encoding an unstable protein can be activated and overexpressed, a gene that
confers stress resistance can be activated, and so forth.

Examples of specific target genes include genes that encode: cell surface proteins
(e.g., glycosylated surface proteins), cancer-associated proteins, cytokines, chemokines,
peptide hormones, neurotransmitters, cell surface receptors (e.g., cell surface receptor kinases,
seven transmembrane receptors, virus receptors and co-receptors), extracellular matrix
binding proteins, cell-binding proteins, and antigens of pathogens (e.g., bacterial antigens,
malarial antigens, and so forth). Additional protein targets include enzymes such as enolases,
cytochrome P450s, acyltransferases, methylases, TIM barrel enzymes, isomerases,
and so forth.
Still more examples include: integrins, cell attachment molecules or "CAMs" such as
cadherins, selectins, N-CAM, E-CAM, U-CAM, I-CAM, and so forth; proteases (e.g.,
subtilisin, trypsin, chymotrypsin; a plasminogen activator, such as urokinase or human
tissue-type plasminogen activator); bombesin; factor IX; thrombin; CD-4; platelet-derived growth
factor; insulin-like growth factor-I and -II; nerve growth factor; fibroblast growth factor (e.g.,
aFGF and bFGF); epidermal growth factor (EGF); VEGF (e.g., VEGF-A); transforming
growth factor (TGF, e.g., TGF-α and TGF-β); insulin-like growth factor binding proteins;
erythropoietin; thrombopoietin; mucins; human serum albumin; growth hormone (e.g.,
human growth hormone); proinsulin; insulin A-chain; insulin B-chain; parathyroid hormone;
thyroid stimulating hormone; thyroxine; follicle stimulating hormone;
natriuretic peptides A, B or C; luteinizing hormone; glucagon; factor VIII; hemopoietic
growth factor; tumor necrosis factor (e.g., TNF-α and TNF-β); enkephalinase; Mullerian-inhibiting
substance; gonadotropin-associated peptide; tissue factor protein; inhibin; activin;
vascular endothelial growth factor; receptors for hormones or growth factors; rheumatoid
factors; osteoinductive factors; an interferon, e.g., interferon-α, -β, -γ; colony stimulating factors
(CSFs), e.g., M-CSF, GM-CSF, and G-CSF; interleukins (ILs), e.g., IL-1, IL-2, IL-3, IL-4,
etc.; decay accelerating factor; and immunoglobulins. In some embodiments, the target gene
encodes a protein or other factor (e.g., an RNA) that is associated with a disease, e.g., cancer,
an infectious disease, inflammation, or a cardiovascular disease.
20 In one embodiment, the gene is a human disease gene. For example, the gene can 

include a mutation that encodes a defective or impaired enzyme or the gene may have a 
defect in a regulatory sequence (e.g., a transcriptional, translational, or splicing regulatory 
sequence). A zinc finger protein can be obtained that increases expression of the gene. 

For example, zinc finger proteins can be designed that interact with an FGF gene, e.g.,
at a binding site in the sequence listed in FIG. 2A-F, or with a hepatocyte growth factor
(HGF) gene, e.g., at a binding site in the sequence listed in FIG. 3A-E. For example, the
proteins may interact with a promoter region of these genes.

A chimeric zinc finger protein for regulating any gene can be designed to interact
with one or more target sites. For example, the target sites can be located in a coding or
non-coding region of the gene. In one embodiment, the target site is located in a regulatory
region, e.g., a transcriptional regulatory region such as the promoter. In one embodiment, the
target site is located within 700, 500, 300, 200, 50, 20, 10, 5, or 3 basepairs of the
transcription start site, a DNase hypersensitive site, or a transcription factor binding site. In
an embodiment in which the target gene is VEGF-A, the binding site can differ from (e.g.,
not overlap with) a site in Table 2 or 3 of WO 02/46412. In another embodiment, the binding
site does overlap with such a site.
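
As an illustration only, the two positional criteria in the preceding paragraph (distance of a target site from a reference position such as the transcription start site, and overlap with a previously described site) can be sketched in Python. The `Site` type, the 0-based half-open coordinate convention, and every function name below are assumptions made for this sketch; none of them comes from the disclosure.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """A target site on one strand; hypothetical 0-based, half-open coordinates."""
    start: int  # first base of the site (inclusive)
    end: int    # one past the last base of the site (exclusive)


def distance_to_position(site: Site, position: int) -> int:
    """Base pairs between the nearest edge of a site and a single position.

    Returns 0 when the position lies inside the site.
    """
    if site.start <= position < site.end:
        return 0
    if position < site.start:
        return site.start - position
    return position - (site.end - 1)


def within_window(site: Site, position: int, window_bp: int) -> bool:
    """True if the site lies within window_bp base pairs of the position."""
    return distance_to_position(site, position) <= window_bp


def overlaps(a: Site, b: Site) -> bool:
    """True if two half-open intervals share at least one base."""
    return a.start < b.end and b.start < a.end
```

For example, a hypothetical 9-bp site at `Site(1000, 1009)` lies 192 bp from a transcription start site at position 1200, so it satisfies the 200-basepair criterion but not the 50-basepair one.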

Gene and Cell-based Therapeutics 

DNA molecules that encode a chimeric zinc finger protein can be inserted into a
variety of DNA constructs and vectors for the purposes of gene therapy. As used herein, a
"vector" is a nucleic acid molecule competent to transport another nucleic acid molecule to
which it has been covalently linked. Vectors include plasmids, cosmids, artificial
chromosomes, viral elements, and RNA vectors (e.g., based on RNA virus genomes). The
vector can be competent to replicate in a host cell or to integrate into a host DNA. Viral
vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated
viruses.

A gene therapy vector is a vector designed for administration to a subject, e.g., a
mammal, such that a cell of the subject is able to express a therapeutic gene contained in the
vector. The gene therapy vector can contain regulatory elements, e.g., a 5' regulatory
element, an enhancer, a promoter, a 5' untranslated region, a signal sequence, a 3'
untranslated region, a polyadenylation site, and a 3' regulatory region. For example, the 5'
regulatory element, enhancer or promoter can regulate transcription of the DNA encoding the
therapeutic polypeptide. The regulation can be tissue specific. For example, the regulation
can restrict transcription of the desired gene to brain cells, e.g., cortical neurons or glial cells;
hematopoietic cells; or endothelial cells. Alternatively, regulatory elements can be included
that respond to an exogenous drug, e.g., a steroid, tetracycline, or the like. Thus, the level
and timing of expression of the therapeutic zinc finger protein (e.g., a polypeptide that
regulates VEGF) can be controlled.

Gene therapy vectors can be prepared for delivery as naked nucleic acid, as a
component of a virus, or of an inactivated virus, or as the contents of a liposome or other
delivery vehicle. See, e.g., US 2003-0143266 and 2002-0150626. In one embodiment, the
nucleic acid is formulated in a lipid-protein-sugar matrix to form microparticles, e.g., having
a diameter between 50 nm and 10 micrometers. The particles may be prepared using any
known lipid (e.g., dipalmitoylphosphatidylcholine, DPPC), protein (e.g., albumin), or sugar
(e.g., lactose).

The gene therapy vectors can be delivered using a viral system. Exemplary viral
vectors include vectors from retroviruses, e.g., Moloney retrovirus, adenoviruses,
adeno-associated viruses, lentiviruses, and herpesviruses, e.g., Herpes simplex viruses (HSV). HSV,
for example, is potentially useful for infecting nervous system cells. See, e.g., US 2003-0147854,
2002-0090716, 2003-0039636, 2002-0068362, and 2003-0104626. The gene delivery agent,
e.g., a viral vector, can be produced from recombinant cells which produce the gene delivery
system.

A gene therapy vector can be administered to a subject, for example, by intravenous
injection, by local administration (see U.S. Patent 5,328,470) or by stereotactic injection (see,
e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The gene therapy agent
can be further formulated, for example, to delay or prolong the release of the agent by means
of a slow release matrix. One method of providing a recombinant zinc finger protein is by
inserting a gene therapy vector into bone marrow cells harvested from a subject. The cells
are infected, for example, with a retroviral gene therapy vector, and grown in culture.
Meanwhile, the subject is irradiated to deplete the subject of bone marrow cells. The bone
marrow of the subject is then replenished with the infected cultured cells. The subject is
monitored for recovery and for production of the therapeutic polypeptide.

Cell-based therapeutic methods include introducing a nucleic acid that encodes the
chimeric zinc finger protein operably linked to a promoter into a cell in culture. The
chimeric zinc finger protein can be selected to regulate an endogenous gene in the cultured cell
or to produce a desired phenotype in the cultured cell. Further, it is also possible to modify
cells, e.g., stem cells, using nucleic acid recombination, e.g., to insert a transgene, e.g., a
transgene encoding a chimeric zinc finger protein that regulates an endogenous gene. The
modified stem cell can be administered to a subject. Methods for cultivating stem cells in
vitro are described, e.g., in US Application 2002-0081724. In some examples, the stem cells
can be induced to differentiate in the subject and express the transgene. For example, the
stem cells can be differentiated into liver, adipose, or skeletal muscle cells. The stem cells
can be derived from a lineage that produces cells of the desired tissue type, e.g., liver,
adipose, or skeletal muscle cells.

In another embodiment, recombinant cells that express or can express a chimeric zinc
finger protein, e.g., as described herein, can be used for replacement therapy in a subject. For
example, a nucleic acid encoding the chimeric zinc finger protein operably linked to a
promoter (e.g., an inducible promoter, e.g., a steroid hormone receptor-regulated promoter) is
introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The
cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate,
and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol.
14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. 5,876,742. Other examples of
biocompatible polymers for encapsulating cells include sodium alginate, barium alginate or
sodium cellulose sulfate. Useful polymers enable proteins (e.g., proteins less than 70, 20, or
10 kDa) to diffuse across them. Ultra-pure materials can improve the viability of
encapsulated cells and reduce immunological reactions. Encapsulated cells, e.g., cells that
include an artificial transcription factor and can produce a diffusible factor, can be used as a
therapy in a subject to provide the diffusible factor to the subject.

One exemplary method for encapsulating cells and tissues involves the use of
coatings formed of a non-fibrogenic alginate, a gelatinous substance that can be derived from
certain kinds of kelp. For example, the cells are suspended in a viscous, liquid alginate,
which is then atomized by any of a number of different arrangements into droplets of suitable
size to encapsulate the cells. Once the droplets come into contact with a gelling solution,
such as calcium chloride or barium chloride, a single layer alginate coating is created around
the cells.

Examples of this approach for creating single layer alginate coatings using an
electrostatic coating process are shown in US 4,789,550, US 4,956,128, US 5,429,821,
US 5,639,467, US 5,656,468 and US 5,693,514. An example for creating a single layer
alginate coating using an air knife process is shown in US 5,521,079. A pressurized process
for coating droplets is described in US 5,260,002 and US 5,462,866. Other examples for
creating a single layer alginate coating using a spinning disk arrangement are shown in
US 5,643,594 and US 6,001,387. Examples for creating a single layer alginate coating using
a piezoelectric nozzle are shown in US 5,286,496, US 5,648,099 and US 6,033,888.
US 5,470,731 and US 5,531,997 describe a double layer coating for tissue that comprises a
first layer of a gel-able organic polymer and a cationic polymer and a second water-soluble,
semi-permeable layer chemically bonded to the first layer. US 6,020,200 describes a dual
layer coating having a stabilized outer layer formed of a cross-linked polymer matrix.
US 5,227,298 (Weber et al.) describes a double walled alginate coating.

Encapsulated cells can be implanted by surgery (e.g., laparoscopic or conventional
surgical methods) or by injection. Cells can be introduced into any appropriate body site
including the liver, spleen, thymus, testes, brain, pancreas, lungs, kidneys, peritoneal cavity,
subcutaneous tissues, fat pads and other locations. See, e.g., J. Rozga et al., Intraabdominal
Organ Transplantation 2000; R. G. Landes Co., USA, 1994: 129.

In implementations where the chimeric zinc finger protein regulates an endogenous
gene that encodes a secreted protein, production of the secreted polypeptide can be regulated
in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another
embodiment, production of the zinc finger protein can be placed under control of an
endogenous signal, e.g., a signal indicating a reduced level of the secreted protein. Thus, an
artificial feedback loop can be used. For example, the signal can be mediated by a
transcription factor that is regulated by the level of the secreted protein itself.

For additional methods for encapsulating cells, see, for example: U.S. 4,391,909;
US 2002-0022016; Lohr et al. (2002) Cancer Chemother. Pharmacol. 49:S21-S24; Hobbs et
al. (2001) Journal of Investigative Medicine 49(6):572-5; Zimmermann et al.
(2001) Ann. N.Y. Acad. Sci.; Moashebi et al. (2001) Tissue Engineering 7(5):525-534;
Orive et al. (2002) Trends in Biotechnology 20:382-7; Lim and Sun (1980)
Science 210:908-910; Reed et al. (2001) Nature Biotech. 19:29-34; and Dornish et al. (2001)
"Standards and guidelines for Biopolymers in Tissue-Engineered Medical Products: ASTM
Alginate and Chitosan Standard Guides." Ann. N.Y. Acad. Sci. 944:388-97.

In still another embodiment, the recombinant cells that express or can express a
chimeric zinc finger protein are cultivated in vitro. A protein produced by the recombinant
cells can be recovered (e.g., purified) from the cells or from media surrounding the cells. In
another example, the recombinant cells are used as feeder cells.

Pharmaceutical Compositions

In another aspect, the invention provides compositions, e.g., pharmaceutically
acceptable compositions, which include a zinc finger protein or a nucleic acid encoding it,
e.g., a molecule described herein, formulated together with a pharmaceutically acceptable
carrier.

As used herein, "pharmaceutically acceptable carrier" includes any and all solvents, 
dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption 
delaying agents, and the like that are physiologically compatible. Preferably, the carrier is 
suitable for intravenous, intramuscular, subcutaneous, parenteral, spinal or epidermal 
administration (e.g., by injection or infusion). Depending on the route of administration, the
active compound may be coated in a material to protect the compound from the action of 
acids and other natural conditions that may inactivate the compound. 

A "pharmaceutically acceptable salt" refers to a salt that retains the desired biological
activity of the parent compound and does not impart any undesired toxicological effects (see,
e.g., Berge, S.M., et al. (1977) J. Pharm. Sci. 66:1-19). Examples of such salts include acid
addition salts and base addition salts. Acid addition salts include those derived from
nontoxic inorganic acids, such as hydrochloric, nitric, phosphoric, sulfuric, hydrobromic,
hydroiodic, phosphorous and the like, as well as from nontoxic organic acids such as
aliphatic mono- and dicarboxylic acids, phenyl-substituted alkanoic acids, hydroxy alkanoic
acids, aromatic acids, aliphatic and aromatic sulfonic acids and the like. Base addition salts
include those derived from alkali and alkaline earth metals, such as sodium, potassium,
magnesium, calcium and the like, as well as from nontoxic organic amines, such as
N,N'-dibenzylethylenediamine, N-methylglucamine, chloroprocaine, choline, diethanolamine,
ethylenediamine, procaine and the like.

The compositions of this invention may be in a variety of forms. These include, for
example, liquid, semi-solid and solid dosage forms, such as liquid solutions (e.g., injectable
and infusible solutions), dispersions or suspensions, tablets, pills, powders, liposomes and
suppositories. The preferred form depends on the intended mode of administration and
therapeutic application. Exemplary compositions are in the form of injectable or infusible
solutions. One mode of administration is parenteral (e.g., intravenous, subcutaneous,
intraperitoneal, intramuscular). In one embodiment, the composition that includes the zinc
finger protein or a nucleic acid encoding it is administered by intravenous infusion or
injection. In another embodiment, the composition that includes the zinc finger protein or a
nucleic acid encoding it is administered by intramuscular or subcutaneous injection.

The phrases "parenteral administration" and "administered parenterally", as used
herein, mean modes of administration other than enteral and topical administration, usually
by injection, and include, without limitation, intravenous, intramuscular, intraarterial,
intrathecal, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal,
subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, epidural
and intrasternal injection and infusion.

Pharmaceutical compositions typically must be sterile and stable under the conditions
of manufacture and storage. Endotoxin levels in the preparation can be tested using the
Limulus amebocyte lysate assay (e.g., using the kit from BioWhittaker, lot # 7L3790,
sensitivity 0.125 EU/mL) according to the USP 24/NF 19 methods. Sterility of
pharmaceutical compositions can be determined using thioglycollate medium according to
the USP 24/NF 19 methods. For example, the preparation is used to inoculate the
thioglycollate medium and incubated at 35°C for 14 or more days. The medium is inspected
periodically to detect growth of a microorganism.

The composition can be formulated as a solution, microemulsion, dispersion,
liposome, or other ordered structure suitable to high drug concentration. Sterile injectable
solutions can be prepared by incorporating the active compound (i.e., the zinc finger protein
or a nucleic acid encoding it) in the required amount in an appropriate solvent with one or a
combination of ingredients enumerated above, as required, followed by filtered sterilization.
Generally, dispersions are prepared by incorporating the active compound into a sterile
vehicle that contains a basic dispersion medium and the required other ingredients from those
enumerated above. In the case of sterile powders for the preparation of sterile injectable
solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which
yield a powder of the active ingredient plus any additional desired ingredient from a
previously sterile-filtered solution thereof. The proper fluidity of a solution can be
maintained, for example, by the use of a coating such as lecithin, by the maintenance of the
required particle size in the case of dispersion and by the use of surfactants. Prolonged
absorption of injectable compositions can be brought about by including in the composition
an agent that delays absorption, for example, monostearate salts and gelatin.

A composition that includes a zinc finger protein or a nucleic acid encoding it can be
administered by a variety of methods known in the art. For many applications, the
route/mode of administration is intravenous injection or infusion. For example, for
therapeutic applications, the composition that includes a zinc finger protein or a nucleic acid
encoding it can be administered by intravenous infusion at a rate of less than 30, 20, 10, 5, or
1 mg/min to reach a dose of about 1 to 100 mg/m² or 7 to 25 mg/m². The route and/or mode
of administration will vary depending upon the desired results. In certain embodiments, the
active compound may be prepared with a carrier that will protect the compound against rapid
release, such as a controlled release formulation, including implants, and microencapsulated
delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene
vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic
acid. Many methods for the preparation of such formulations are patented or generally
known. See, e.g., Sustained and Controlled Release Drug Delivery Systems, J.R. Robinson,
ed., Marcel Dekker, Inc., New York, 1978.
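
The relationship between an infusion rate ceiling (mg/min) and a dose expressed per unit of body surface area (mg/m²) is simple arithmetic, sketched below for illustration. The specification does not say how body surface area is obtained; the Mosteller formula used here, and all function and parameter names, are assumptions of this sketch.

```python
import math


def mosteller_bsa_m2(height_cm: float, weight_kg: float) -> float:
    """Body surface area in m^2 (Mosteller formula; an assumption of this sketch)."""
    if height_cm <= 0 or weight_kg <= 0:
        raise ValueError("height and weight must be positive")
    return math.sqrt(height_cm * weight_kg / 3600.0)


def total_dose_mg(dose_mg_per_m2: float, bsa_m2: float) -> float:
    """Total amount to deliver for a surface-area-normalized dose."""
    return dose_mg_per_m2 * bsa_m2


def min_infusion_minutes(total_mg: float, rate_cap_mg_per_min: float) -> float:
    """Lower bound on infusion time when the rate must stay below the cap.

    Because the text specifies a rate of less than the cap, the actual
    infusion takes longer than this bound.
    """
    if rate_cap_mg_per_min <= 0:
        raise ValueError("rate cap must be positive")
    return total_mg / rate_cap_mg_per_min
```

Under these assumptions, a subject 180 cm tall weighing 80 kg has a body surface area of 2.0 m²; a dose of 25 mg/m² then corresponds to 50 mg in total, which at a rate below 5 mg/min takes more than 10 minutes to infuse.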

In certain embodiments, the composition may be orally administered, for example,
with an inert diluent or an assimilable edible carrier. The compound (and other ingredients,
if desired) also may be enclosed in a hard or soft shell gelatin capsule, compressed into
tablets, or incorporated directly into the subject's diet. For oral therapeutic administration,
the compounds may be incorporated with excipients and used in the form of ingestible tablets,
buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. To
administer a compound described herein by other than parenteral administration, it may be
necessary to coat the compound with, or co-administer the compound with, a material to
prevent its inactivation.

Pharmaceutical compositions can be administered with medical devices known in the
art. For example, in a preferred embodiment, a pharmaceutical composition described herein
can be administered with a needle-less hypodermic injection device, such as the devices
disclosed in U.S. Patent Nos. 5,399,163, 5,383,851, 5,312,335, 5,064,413, 4,941,880,
4,790,824, or 4,596,556. Examples of well-known implants and modules useful in the
invention include: U.S. Patent No. 4,487,603, which discloses an implantable micro-infusion
pump for dispensing medication at a controlled rate; U.S. Patent No. 4,486,194, which
discloses a therapeutic device for administering medicaments through the skin; U.S. Patent
No. 4,447,233, which discloses a medication infusion pump for delivering medication at a
precise infusion rate; U.S. Patent No. 4,447,224, which discloses a variable flow implantable
infusion apparatus for continuous drug delivery; U.S. Patent No. 4,439,196, which discloses
an osmotic drug delivery system having multi-chamber compartments; and U.S. Patent
No. 4,475,196, which discloses an osmotic drug delivery system. Of course, many other
such implants, delivery systems, and modules also are known.

In certain embodiments, the compounds described herein can be formulated to ensure
proper distribution in vivo. For example, the blood-brain barrier (BBB) excludes many
highly hydrophilic compounds. To ensure that a therapeutic can cross the BBB (if desired), it
can be formulated, for example, in a liposome. For methods of manufacturing liposomes, see,
e.g., U.S. 4,522,811; 5,374,548; and 5,399,331. The liposomes may include one or more
moieties which are selectively transported into specific cells or organs, thus enhancing targeted
drug delivery (see, e.g., V.V. Ranade (1989) J. Clin. Pharmacol. 29:685).

Dosage regimens are adjusted to provide the optimum desired response (e.g., a
therapeutic response). For example, a single bolus may be administered, several divided
doses may be administered over time, or the dose may be proportionally reduced or increased
as indicated by the exigencies of the therapeutic situation. It is especially advantageous to
formulate parenteral compositions in dosage unit form for ease of administration and
uniformity of dosage. Dosage unit form as used herein refers to physically discrete units
suited as unitary dosages for the subjects to be treated; each unit contains a predetermined
quantity of active compound calculated to produce the desired therapeutic effect in
association with the required pharmaceutical carrier. The specification for the dosage unit
forms can be dictated by and directly dependent on (a) the unique characteristics of the active
compound and the particular therapeutic effect to be achieved, and (b) the limitations
inherent in compounding such an active compound for the treatment of sensitivity in
individuals.

An exemplary, non-limiting range for a therapeutically or prophylactically effective
amount of a composition described herein is 0.1-20 mg/kg, more preferably 1-10 mg/kg.
The composition can be administered by intravenous infusion at a rate of less than 30, 20, 10,
5, or 1 mg/min to reach a dose of about 1 to 100 mg/m² or about 5 to 30 mg/m². It is to be
noted that dosage values may vary with the type and severity of the condition to be alleviated.
It is to be further understood that for any particular subject, specific dosage regimens should
be adjusted over time according to the individual need and the professional judgment of the
person administering or supervising the administration of the compositions, and that dosage
ranges set forth herein are exemplary only and are not intended to be limiting.

A pharmaceutical composition may include a "therapeutically effective amount" or a
"prophylactically effective amount" of a zinc finger protein or a nucleic acid encoding it, e.g.,
a protein or nucleic acid described herein. A "therapeutically effective amount" refers to an
amount effective, at dosages and for periods of time necessary, to achieve the desired
therapeutic result. A therapeutically effective amount of the composition may vary according
to factors such as the disease state, age, sex, and weight of the individual, and the ability of
the protein to elicit a desired response in the individual. A therapeutically effective amount
is also one in which any toxic or detrimental effects of the composition are outweighed by
the therapeutically beneficial effects. A "therapeutically effective dosage" preferably inhibits
a measurable parameter, e.g., inflammation or tumor growth rate, by at least about 20%, more
preferably by at least about 40%, even more preferably by at least about 60%, and still more
preferably by at least about 80% relative to untreated subjects. The ability of a compound to
inhibit a measurable parameter, e.g., cancer, can be evaluated in an animal model system
predictive of efficacy in human tumors. Alternatively, this property of a composition can be
evaluated by examining the ability of the compound to inhibit the parameter in vitro, using
assays known to the skilled practitioner.
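
The inhibition thresholds in the preceding paragraph are percentages relative to untreated subjects. A minimal sketch of that calculation follows; the function names, the use of a single summary value per group, and the tier lookup are assumptions made for illustration and are not a method stated in the specification.

```python
from typing import Optional

# Thresholds named in the text, from most to least stringent.
INHIBITION_TIERS = (80.0, 60.0, 40.0, 20.0)


def percent_inhibition(treated: float, untreated: float) -> float:
    """Reduction of a measured parameter in treated vs. untreated subjects, in percent."""
    if untreated <= 0:
        raise ValueError("untreated value must be positive")
    return 100.0 * (untreated - treated) / untreated


def highest_tier_met(pct: float) -> Optional[float]:
    """Most stringent threshold (80, 60, 40, or 20) that pct reaches, or None."""
    for tier in INHIBITION_TIERS:
        if pct >= tier:
            return tier
    return None
```

For instance, if a hypothetical tumor growth rate is 100 units in untreated subjects and 30 units in treated subjects, the inhibition is 70%, which meets the 60% threshold but not the 80% one.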

A "prophylactically effective amount" refers to an amount effective, at dosages and
for periods of time necessary, to achieve the desired prophylactic result. Typically, since a
prophylactic dose is used in subjects prior to or at an earlier stage of disease, the
prophylactically effective amount will be less than the therapeutically effective amount.

Also within the scope of the invention are kits including the zinc finger protein or a
nucleic acid that encodes it and instructions for use, e.g., treatment, prophylactic, or
diagnostic use. In an embodiment in which the zinc finger protein regulates the VEGF-A
gene, the instructions for therapeutic applications include suggested dosages and/or modes of
administration in a patient with a cancer or neoplastic disorder, or an angiogenesis-related
disorder (e.g., certain inflammatory disorders). The kit can further contain at least one
additional reagent, such as a diagnostic or therapeutic agent, e.g., a diagnostic or therapeutic
agent as described herein, and/or one or more additional zinc finger proteins or nucleic acids,
formulated as appropriate, in one or more separate pharmaceutical preparations.

Treatments

Zinc finger proteins that can regulate an endogenous gene, particularly proteins that
can regulate the VEGF-A gene, have therapeutic and prophylactic utilities. For example,
these proteins or nucleic acids encoding them can be administered to cells in culture, e.g., in
vitro or ex vivo, or in a subject, e.g., in vivo, to treat, prevent, and/or diagnose a variety of
disorders, such as cancers, particularly metastatic cancers, an inflammatory disorder, and
other disorders associated with increased angiogenesis.

As used herein, the term "treat" or "treatment" is defined as the application or
administration of an agent which enables a zinc finger protein to enter a cell and regulate
gene expression, to a subject, e.g., a patient, or application or administration of the agent to
an isolated tissue or cell, e.g., cell line, from a subject, e.g., a patient, who has a disorder (e.g.,
a disorder as described herein), a symptom of a disorder or a predisposition toward a disorder,
with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect
the disorder, the symptoms of the disorder or the predisposition toward the disorder.

In one embodiment, "treating a cell" or "treating a tissue" refers to a reduction in at
least one activity of a cell, e.g., VEGF-A production, angiogenesis stimulation, proliferation,
or other activity of a cell, e.g., a hyperproliferative cell or cell in or near a tissue, e.g., a tumor.
Such reduction can include a reduction, e.g., a statistically significant reduction, in the
activity of a cell or tissue (e.g., metastatic tissue), the number of cells, the size of the
tissue, or the amount or degree of blood supply to the tissue. An example of a reduction in
activity is a reduction in migration of the cell (e.g., migration through an extracellular matrix),
a reduction in blood vessel formation, or a reduction in cell differentiation. Another example
is an activity that, directly or indirectly, reduces inflammation or an indicator of
inflammation.

As used herein, an amount of a zinc finger protein or a nucleic acid encoding it
effective to treat a disorder, or a "therapeutically effective amount", refers to an amount of the
protein or nucleic acid which is effective, upon single or multiple dose administration to a
subject, in treating a cell.

As used herein, an amount of a zinc finger protein or a nucleic acid encoding it
effective to prevent a disorder, or a "prophylactically effective amount" of the protein or
nucleic acid, refers to an amount of the protein or the nucleic acid encoding it which is
effective, upon single- or multiple-dose administration to the subject, in preventing or
delaying the occurrence of the onset or recurrence of a disorder, e.g., a cancer,
angiogenesis-based disorder, or inflammatory disorder.

As used herein, the term "subject" is intended to include human and non-human
animals. Exemplary subjects include a human patient having a disorder characterized by
abnormal cell proliferation or cell differentiation. The term "non-human animals" includes
all non-human vertebrates, e.g., non-mammals (such as chickens, amphibians, reptiles) and
mammals, such as non-human primates, sheep, dog, cow, pig, etc.

In one embodiment, the subject is a human subject. In one embodiment, the
composition of a zinc finger protein or a nucleic acid encoding it can be administered to a
non-human mammal (e.g., a primate, pig or mouse) for veterinary purposes or as an animal
model of human disease. Regarding the latter, such animal models may be useful for
evaluating the therapeutic efficacy of the composition (e.g., testing of dosages and time
courses of administration).

In one embodiment, the invention provides a method of treating a neoplastic disorder.
The method can include the steps of contacting a cell of a subject with a zinc finger protein
or a nucleic acid encoding it, e.g., a zinc finger protein that regulates VEGF-A or a nucleic
acid encoding it, e.g., as described herein, in an amount sufficient to treat or prevent the
neoplastic disorder. For example, the disorder can be caused by a cancerous cell, a tumor
cell or a metastatic cell. The subject method can be used on cells in culture, e.g., in vitro or ex
vivo. For example, cancerous or metastatic cells (e.g., renal, urothelial, colon, rectal, lung,
breast, endometrial, ovarian, prostatic, or liver cancerous or metastatic cells) can be cultured
in vitro in culture medium and the contacting step can be effected by adding the zinc finger
protein or a nucleic acid encoding it to the culture medium. The method can be performed on
cells (e.g., cancerous or metastatic cells) present in a subject (e.g., a human subject), as part
of an in vivo (e.g., therapeutic or prophylactic) protocol. For in vivo embodiments, the
contacting step is effected in a subject and includes administering the zinc finger protein or a
nucleic acid encoding it to the subject under conditions effective to permit regulation of the
VEGF-A gene in cells of the subject.

The method can be used to treat a cancer. As used herein, the terms "cancer",
"hyperproliferative", "malignant", and "neoplastic" are used interchangeably, and refer to
those cells in an abnormal state or condition characterized by rapid proliferation or neoplasm.
The terms include all types of cancerous growths or oncogenic processes, metastatic tissues
or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or
stage of invasiveness. "Pathologic hyperproliferative" cells occur in disease states
characterized by malignant tumor growth.

The common medical meaning of the term "neoplasia" refers to "new cell growth"
that results as a loss of responsiveness to normal growth controls, e.g., to neoplastic cell
growth. A "hyperplasia" refers to cells undergoing an abnormally high rate of growth.
However, as used herein, the terms neoplasia and hyperplasia can be used interchangeably, as
their context will reveal, referring generally to cells experiencing abnormal cell growth rates.
Neoplasias and hyperplasias include "tumors," which may be benign, premalignant or
malignant.

Examples of cancerous disorders include, but are not limited to, solid tumors, soft
tissue tumors, and metastatic lesions. Examples of solid tumors include malignancies, e.g.,
sarcomas, adenocarcinomas, and carcinomas, of the various organ systems, such as those
affecting lung, breast, lymphoid, gastrointestinal (e.g., colon), and genitourinary tract (e.g.,
renal, urothelial cells), pharynx, prostate, ovary as well as adenocarcinomas which include
malignancies such as most colon cancers, rectal cancer, renal-cell carcinoma, liver cancer,
non-small cell carcinoma of the lung, cancer of the small intestine and so forth. Metastatic
lesions of the aforementioned cancers also can be treated or prevented using a method or
composition described herein.

The subject method can be useful in treating malignancies of the various organ
systems, such as those affecting lung, breast, lymphoid, gastrointestinal (e.g., colon), and
genitourinary tract, prostate, ovary, pharynx, as well as adenocarcinomas which include
malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or
testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and
cancer of the esophagus. The term "carcinoma" is recognized by those skilled in the art and
refers to malignancies of epithelial or endocrine tissues including respiratory system
carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular
carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and
melanomas. Exemplary carcinomas include choriocarcinomas and those forming from tissue
of the cervix, lung, prostate, breast, endometrium, head and neck, colon and ovary. The term
also includes carcinosarcomas, e.g., which include malignant tumors composed of
carcinomatous and sarcomatous tissues. An "adenocarcinoma" refers to a carcinoma derived
from glandular tissue or in which the tumor cells form recognizable glandular structures. The
term "sarcoma" is recognized by those skilled in the art and refers to malignant tumors of
mesenchymal derivation.

The subject method also can be used to inhibit the proliferation of 
hyperplastic/neoplastic cells of hematopoietic origin shown to express VEGF-A. 

Methods of administering zinc finger proteins or nucleic acids are described in
"Pharmaceutical Compositions". Suitable dosages of the molecules used will depend on the
age and weight of the subject and the particular drug used.

A zinc finger protein or a nucleic acid encoding it can be coupled to a label, e.g., for
imaging in a subject after it is delivered to a subject. Suitable labels include MRI-detectable
labels or radiolabels.

A zinc finger protein or a nucleic acid encoding it described herein can be
administered alone or in combination with one or more of the existing modalities for treating
cancers, including, but not limited to: surgery, radiation therapy, and chemotherapy.

A zinc finger protein or a nucleic acid encoding it, particularly one that can regulate
(e.g., reduce expression of) the VEGF-A gene, can be administered alone or in combination
with one or more of the existing modalities for treating an inflammatory disease or disorder.
Exemplary inflammatory diseases or disorders include: acute and chronic immune and
autoimmune pathologies, such as, but not limited to, rheumatoid arthritis (RA), juvenile
chronic arthritis (JCA), psoriasis, graft versus host disease (GVHD), scleroderma, diabetes
mellitus, allergy, asthma, acute or chronic immune disease associated with an allogenic
transplantation, such as, but not limited to, renal transplantation, cardiac transplantation, bone
marrow transplantation, liver transplantation, pancreatic transplantation, small intestine
transplantation, lung transplantation and skin transplantation; chronic inflammatory
pathologies such as, but not limited to, sarcoidosis, chronic inflammatory bowel disease,
ulcerative colitis, and Crohn's pathology or disease; vascular inflammatory pathologies, such
as, but not limited to, disseminated intravascular coagulation, atherosclerosis, Kawasaki's
pathology and vasculitis syndromes, such as, but not limited to, polyarteritis nodosa,
Wegener's granulomatosis, Henoch-Schonlein purpura, giant cell arthritis and microscopic
vasculitis of the kidneys; chronic active hepatitis; Sjogren's syndrome; psoriatic arthritis;
enteropathic arthritis; reactive arthritis and arthritis associated with inflammatory bowel
disease; and uveitis.

Inflammatory bowel diseases (IBD) include generally chronic, relapsing intestinal
inflammation. IBD refers to two distinct disorders, Crohn's disease and ulcerative colitis
(UC). The clinical symptoms of IBD include intermittent rectal bleeding, crampy abdominal
pain, weight loss and diarrhea. A clinical index can also be used to monitor IBD such as the
Clinical Activity Index for Ulcerative Colitis. See also, e.g., Walmsley et al. Gut, 1998
Jul;43(1):29-32 and Jowett et al. (2003) Scand J Gastroenterol. 38(2):164-71.

A zinc finger protein or a nucleic acid encoding it can be used to treat or prevent one
of the foregoing diseases or disorders. For example, the protein can be administered (locally
or systemically) in an amount effective to ameliorate at least one symptom of the respective
disease or disorder. The protein may also ameliorate inflammation, e.g., an indicator of
inflammation, such as local temperature, swelling (e.g., as measured), redness, local or
systemic white blood cell count, presence or absence of neutrophils, cytokine levels, and so
forth. It is possible to evaluate a subject, e.g., prior to, during, or after administration of the
protein, for one or more indicators of inflammation, e.g., an aforementioned indicator.

A zinc finger protein or a nucleic acid encoding it, particularly one that can regulate
(e.g., increase expression of) the VEGF-A gene, can be administered alone or in combination
with one or more of the existing modalities for treating a wound, e.g., to promote wound
healing. For example, generally, activation of VEGF-A can increase formation of new blood
vessels and capillaries. The protein or nucleic acid can also be used for ameliorating surgery,
burn, traumas, ulcers, bone fractures, and other disorders that require increased angiogenesis.

A zinc finger protein or a nucleic acid encoding it, particularly one that can regulate
(e.g., increase expression of) the VEGF-A gene, can be administered alone or in combination
with one or more of the existing modalities for treating a cardiovascular disorder, e.g.,
ischemic heart disease, peripheral artery disease, or coronary artery disease.

The present invention will be described in more detail through the following practical
examples. However, it should be noted that these examples are not intended to limit the
scope of the present invention.

Example 1: Gel shift assays 

This example provides a method of evaluating the DNA binding properties of zinc
finger proteins in vitro. Zinc finger proteins were expressed in E. coli, purified, and used in
gel shift assays. The DNA segments encoding zinc finger proteins were inserted into
pGEX-4T2 (Pharmacia Biotech). These constructs were expressed in E. coli strain BL21 to
produce fusion proteins that include the zinc finger proteins connected to GST (glutathione-
S-transferase). The fusion proteins were purified using glutathione affinity chromatography
(Pharmacia Biotech, Piscataway, NJ) and then digested with thrombin. Thrombin cleaves the
linker sequence between the GST moiety and zinc finger proteins.

Various amounts of a zinc finger protein were incubated with a radioactively labeled
probe DNA for one hour at room temperature in 20 mM Tris pH 7.7, 120 mM NaCl, 5 mM
MgCl2, 20 µM ZnSO4, 10% glycerol, 0.1% Nonidet P-40, 5 mM DTT, and 0.10 mg/mL BSA
(bovine serum albumin), and then the reaction mixtures were subjected to gel electrophoresis.
Distribution of the probe in the gel was quantitated by PHOSPHORIMAGER™ analysis
(Molecular Dynamics). Dissociation constants (Kd) were determined as described (Rebar and
Pabo (1994) Science 263:671-673).

We have previously found that zinc finger proteins that function in an in vivo yeast assay
also have biochemical activity. In general, when a zinc finger protein, e.g., having three zinc
finger domains, binds a DNA sequence with a dissociation constant lower than 1 nM, it
allows cell growth in the one-hybrid yeast cell assay described in US 2002-0061512, whereas
when a zinc finger protein binds a DNA sequence with a dissociation constant higher than
1 nM, it does not allow cell growth in that assay. Zinc finger proteins that bind with a
dissociation constant of greater than 1 nM but less than 50 nM can also be useful. For
example, additional fingers can be added to those zinc fingers to produce tighter or more
specific binders.
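
The relationship between a gel shift titration and the Kd ranges discussed above can be sketched as follows. This is only an illustrative sketch and not the procedure of Rebar and Pabo: it assumes a simple 1:1 binding isotherm with protein in large excess over probe, and the function names, the grid search, and the category labels are hypothetical.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of probe shifted under a simple 1:1 binding model.

    Assumes the protein concentration is in large excess over the probe.
    """
    return protein_nM / (protein_nM + kd_nM)


def estimate_kd(protein_nM, observed_fraction):
    """Least-squares estimate of Kd (nM) on a log-spaced grid from 0.001 to 1000 nM."""
    best_kd, best_sse = None, float("inf")
    for k in range(-300, 301):
        kd = 10 ** (k / 100)
        sse = sum((fraction_bound(p, kd) - f) ** 2
                  for p, f in zip(protein_nM, observed_fraction))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


def classify_binder(kd_nM):
    """Map a Kd onto the ranges discussed in the text (labels are hypothetical).

    A Kd of exactly 1 nM is placed in the intermediate class as an assumption.
    """
    if kd_nM < 1.0:
        return "tight"          # below 1 nM
    if kd_nM < 50.0:
        return "intermediate"   # 1 nM up to 50 nM; may still be useful
    return "weak"
```

For instance, a titration series whose shifted fraction reaches one half near 0.5 nM protein would be classified as "tight" under these assumptions.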

The in vitro assay can also be adapted to evaluate binding by an individual zinc finger
domain to a particular three or four basepair site. In one implementation, the individual zinc
finger domain is evaluated in the context of fingers 1 and 2 of Zif268 and a target site that
includes (i) basepairs recognized by fingers 1 and 2 and (ii) the particular three or four
basepair site.



Example 2: Construction of Individual Three-fingered Proteins

This example provides a method for constructing a nucleic acid encoding a chimeric
three-fingered protein. The vector P3 (Toolgen, Inc.) was used to express chimeric zinc
finger proteins in mammalian cells. P3 was constructed by modification of the pcDNA3
vector (Invitrogen, San Diego CA). A synthetic oligonucleotide duplex having compatible
overhangs was ligated into the pcDNA3 vector digested with HindIII and XhoI. The duplex
contains nucleic acid that encodes the hemagglutinin (HA) tag and a nuclear localization
signal. The duplex also includes BamHI, EcoRI, NotI, and BglII restriction sites and
a stop codon. Further, the XmaI site in the SV40 origin of the resulting vector was destroyed by
digestion with XmaI, filling in the overhanging ends of the digested restriction site, and
religation of the ends.



The following is one exemplary method for constructing a plasmid that encodes a
chimeric zinc finger protein with multiple zinc finger domains. First, an insert that encodes a
single zinc finger domain was inserted into a vector (the P3 vector) that harbored a sequence
encoding a single zinc finger domain. The result of this cloning is a plasmid that encodes a
zinc finger protein with two zinc finger domains. A zinc finger domain insert consisting of
two zinc finger domains was prepared by the above method and cloned into AgeI/NotI-
linearized vector P3 having one or two zinc finger domains to obtain a plasmid containing a
zinc finger protein gene consisting of three or four zinc finger domains.

Genes encoding chimeric zinc finger proteins were then cloned into pre-prepared
plasmids that encode a functional domain, e.g., p65 transcriptional activation domain, a Kid
transcriptional repression domain, or a KOX transcriptional repression domain. The
plasmids that include the genes encoding chimeric zinc finger proteins were digested with
EcoRI/NotI and ligated into plasmids linearized with the same enzymes. The cloning site in
the acceptor plasmids (pLFD-p65, pLFD-KRAB, pLFD-KOX) placed the sequence encoding
the zinc finger domains in a position that results in the DNA binding region being N-terminal
to the functional domain. The resulting constructs encode a protein that includes, from N- to
C-terminus: HA-tag, nuclear localization signal, zinc finger protein and the functional
domain.

Example 3: In vivo Assays for Three-Fingered Proteins with Human Zinc Finger 
Domains 

An in vivo repression assay was used to determine if the new three-fingered proteins
were functional in vivo. See, for example, Kim and Pabo ((1997) J Biol Chem 272:29795-
29800). The assay utilized a luciferase reporter construct in which a target site is located at a
position comparable to the position of the Zif268 site in the construct of Kim and Pabo, supra.

The luciferase reporter plasmids were constructed from pAS-modi, a modified version
of pGL3-TATA/Inr (Kim and Pabo, supra). These reporters utilize firefly luciferase as the
reporter protein. The SacI site upstream of the TATA box was deleted from pAS-modi.
A new SacI site was inserted following the transcription initiation site. Different reporter
plasmids were made for each of the different zinc finger proteins. To construct each plasmid,
an oligomer containing a given nine basepair binding site that is predicted to interact with a
particular zinc finger protein was inserted into the plasmid. The plasmid pAS-modi was
digested with SacI and HindIII, and the oligomer was inserted. This manipulation replaces
14 base pairs at a position 12 basepairs downstream from the transcription initiation site. The
resulting reporter plasmids were named plG-ZFP ID, wherein ID was the name of the
corresponding zinc finger protein.

The in vivo activity assay for a particular three-fingered protein was carried out as
follows. HEK 293 cells were transfected with four plasmids: 14 ng of a plasmid expressing
the particular three-fingered protein; 14 ng of the reporter plasmid described above; 70 ng of
a plasmid that expresses GAL4-VP16; and 1.4 ng of a plasmid that expresses Renilla
luciferase. The GAL4-VP16 activates transcription of the minimal synthetic promoter in the
reporter absent repression by a particular three-fingered protein. The repression activity of each
zinc finger protein was compared to that of other three-fingered proteins. The plasmid expressing
Renilla luciferase controlled for transfection efficiency.

LIPOFECTAMINE™ (Gibco-BRL) was used for the transfection procedures. Cells
were transfected at 30-50% confluency in wells of a 96 well plate. The cells were incubated
for two days prior to harvesting for the luciferase assay. Then luciferase activities were
measured using the DUAL-LUCIFERASE™ Reporter Assay System (Promega). The
observed firefly luciferase activity was normalized using the observed level of Renilla
luciferase. The extent of repression or "fold-repression" was calculated by dividing a value
for normalized reporter expression in the absence of a zinc finger protein by a value for
normalized reporter expression in the presence of the zinc finger protein.

Zinc finger proteins were classified as satisfying a high stringency cut-off value if
they repressed transcription at least 2-fold in the transfection assay or as satisfying a low
stringency cut-off value if they repressed between 1.5 and 2-fold in the transfection assay.
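
The normalization, fold-repression calculation, and cut-off classification described above can be illustrated with a short sketch. The function names and the example readings are hypothetical, and the handling of values that fall exactly on a cut-off (2-fold counted as high, 1.5-fold counted as low) is an assumption.

```python
def normalized_expression(firefly, renilla):
    """Firefly luciferase activity normalized to the Renilla transfection control."""
    return firefly / renilla


def fold_repression(firefly_no_zfp, renilla_no_zfp, firefly_zfp, renilla_zfp):
    """Normalized expression without the zinc finger protein divided by that with it."""
    without = normalized_expression(firefly_no_zfp, renilla_no_zfp)
    with_zfp = normalized_expression(firefly_zfp, renilla_zfp)
    return without / with_zfp


def stringency_class(fold):
    """Classify a fold-repression value against the two cut-offs in the text."""
    if fold >= 2.0:
        return "high"
    if fold >= 1.5:
        return "low"
    return "none"


# Hypothetical luminometer readings (arbitrary units)
example = fold_repression(1000, 100, 250, 100)
print(example, stringency_class(example))
```

Dividing by the Renilla signal before taking the ratio keeps well-to-well differences in transfection efficiency from being mistaken for repression.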



Example 4: Binding Assay Result of ZFPs with Their Specific Reporter 

Gel shift assays were used to correlate activity observed in the in vivo assays to
binding affinity. The binding of Zif268 to different target sequences was evaluated using gel
shift assays and the transfection assay described above in Example 3. A good correlation
was observed between the dissociation constants measured by gel shift assays and the level
of transcriptional repression in the transfection assays described above. In general, zinc
finger proteins exhibiting more than 2-fold repression (that is, 50% repression) in the
transfection assays showed a dissociation constant of less than 1 nM as determined by gel
shift assays.

Example 5: Characterization of Three-fingered Proteins

Two types of "three-finger" chimeric zinc finger proteins were constructed. One
type includes chimeric proteins that are composed exclusively of wild-type human zinc
finger domains, i.e., domains that are identical to naturally-occurring human zinc finger
domains. The other type includes chimeric proteins that include zinc finger domains that are
not identical to a naturally-occurring zinc finger domain. The latter zinc finger domains were
typically identified by in vitro mutagenesis of a naturally-occurring zinc finger domain
followed by phage display selection. Such mutant domains have avoided the scrutiny of
natural evolution.

A total of 36 zinc finger domains, 18 human zinc finger domains and 18 mutated
zinc finger domains, were used to assemble a set of test three-fingered proteins. The mutated
zinc finger domains have been reported in Choo and Klug (1994) Proc. Natl. Acad. Sci. USA
91:11168-11172; Desjarlais and Berg (1994) Proc. Natl. Acad. Sci. USA 91:11099-11103;
Dreier et al. (2001) J. Biol. Chem. 276:29466-29478; Dreier et al. (2000) J. Mol. Biol. 303:489-
502; Fairall et al. (1993) Nature 366:483-487; Greisman and Pabo (1997) Science 275:657-
661; Kim and Pabo (1997) J. Biol. Chem. 272:29795-29800; and Segal et al. (1999) Proc.
Natl. Acad. Sci. USA 96:2758-2763. See also Table 9 of US 2003-165997. Nucleic acids
encoding the 36 domains were individually subcloned into P3 vector digested with EcoRI
and NotI, and the resulting plasmids were used as starting material for the chimeric zinc
finger protein construction.

Nucleic acids encoding chimeric three-fingered proteins were prepared by two
different methods. In the first method, nucleic acids encoding all the zinc finger domains
were randomly mixed, and three-fingered constructs were randomly picked for further
analysis. Each construct was sequenced to determine the component zinc finger domains in
the polypeptide that it encodes. Subsequently, target DNA sequences were synthesized for
each randomly assorted three-fingered protein. Target DNA sequences were based on the
expected preferred target site. The targets were cloned into the luciferase reporter vector
described above. This approach is referred to as the "zinc finger protein-first" approach.

In the second method, nucleic acids encoding chimeric three-fingered proteins were
assembled based on given target DNA sequences. A computer algorithm was used to
match recognition sites of zinc finger domains and target DNA sequences. Promoter
sequences of known genes were used as the input target DNA sequences. The promoter
sequences were scanned to identify segments that are nine nucleotides in length and that are
acceptable target sites for recognition by chimeric three-fingered proteins given the available
collection of zinc finger domains. Once identified, a nucleic acid was constructed that
encoded the chimeric three-fingered proteins. This approach is referred to as the "target site-
first" approach.
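
A minimal sketch of such a scan is given below. It is not the algorithm actually used; it simply slides a nine-nucleotide window along a sequence and keeps windows in which every triplet is recognized by at least one available domain. The domain names (ZF_A and so on), the triplet-to-domain table, and the input sequence are all hypothetical placeholders.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement, for scanning the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_target_sites(promoter, domains_by_triplet):
    """Return (offset, 9-nt site, domains per triplet) for every acceptable window.

    domains_by_triplet maps a 3-nt subsite (written 5'->3') to the names of the
    available zinc finger domains assumed to recognize it.
    """
    promoter = promoter.upper()
    hits = []
    for start in range(len(promoter) - 8):
        site = promoter[start:start + 9]
        triplets = [site[i:i + 3] for i in (0, 3, 6)]
        if all(t in domains_by_triplet for t in triplets):
            hits.append((start, site, [domains_by_triplet[t] for t in triplets]))
    return hits


# Hypothetical domain collection and toy sequence
toy_domains = {"GAG": ["ZF_A"], "CGG": ["ZF_B"], "GGA": ["ZF_C"]}
print(find_target_sites("TTGAGCGGGGATT", toy_domains))
```

Sites on the reverse strand can be found by passing `reverse_complement(promoter)` to the same function and converting the offsets back to forward-strand coordinates.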

Zinc finger domains that include an aspartate residue at position 2 of the base
contacting residues were analyzed with special consideration. Such zinc finger domains
include RDER1, RDHT, RDNR, RDKR, RDTN, TDKR, and NDTR. The X-ray co-crystal
structure of Zif268 bound to DNA showed that an aspartate at position 2 can form a
hydrogen bond with a base outside of the 3-basepair subsite recognized by zinc fingers. As a
result, the RDER finger containing an aspartate residue at position 2 prefers the 4-basepair
site 5'-GCG(G/T)-3'. The computer algorithm accounted for this additional specificity.
Randomly-assembled three-fingered proteins that include a finger having aspartate at
position 2 and that violate this rule for the 4-bp site were excluded in other analyses
described herein.
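
One way the 4-basepair constraint could be checked is sketched below. The sketch assumes that the G/T preference stated for the RDER finger applies to any finger with aspartate at position 2, and that the extra base is the one immediately 3' of that finger's triplet on the target strand; both the function name and these assumptions are illustrative only.

```python
ALLOWED_FOURTH_BASE = {"G", "T"}


def satisfies_asp2_rule(site, triplet_index, flank_3prime=None):
    """Check the base immediately 3' of the triplet bound by an Asp-2 finger.

    site          -- nine-nucleotide target written 5'->3'
    triplet_index -- 0, 1 or 2: which triplet the Asp-2 finger recognizes
    flank_3prime  -- sequence following the site, needed only for triplet 2
    """
    site = site.upper()
    position = 3 * triplet_index + 3
    if position < len(site):
        fourth = site[position]           # first base of the neighbouring triplet
    elif flank_3prime:
        fourth = flank_3prime[0].upper()  # base just beyond the nine-nucleotide site
    else:
        raise ValueError("flanking base needed for the 3'-most triplet")
    return fourth in ALLOWED_FOURTH_BASE
```

For the two 5'-most triplets the fourth base overlaps the next triplet, which is why a randomly assembled protein can violate the rule even when each finger matches its own triplet.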

A total of 153 three-fingered proteins were constructed from both the "zinc finger
protein-first" and the "target site-first" approaches. These proteins were tested using the
transient cotransfection assay described in Example 3.

31 of 153 chimeric zinc finger proteins demonstrated greater than 2-fold repression,
the high stringency criterion (RF > 2; RF = fold repression). Of the proteins constructed
entirely from naturally-occurring human zinc finger domains, 28.1% (27 of 96) exceeded the
high stringency criterion and 59.4% exceeded the low stringency criterion (RF > 1.5). Of the
proteins constructed from two naturally-occurring zinc finger domains and one mutated
domain, 33.3% exceeded the high stringency criterion, and only 20% exceeded the low
stringency criterion.

In contrast, of the 17 proteins constructed from one human domain and two mutated
domains, only one protein (5.9%) exceeded the high stringency criterion, and only two
proteins (11.8%) exceeded the low stringency criterion. Strikingly, no zinc finger proteins
composed exclusively of mutated domains satisfied the high stringency criterion in the
repression assay. Only one such protein (4%) satisfied the low stringency criterion. These
results indicate that naturally-occurring human zinc finger domains are frequently better
building blocks for the construction of new DNA-binding proteins than mutated domains.

Example 6: Designed Chimeric Zinc Finger Proteins that bind to the VEGF-A gene

In this example, we designed chimeric zinc finger proteins that bind to DNA elements
in the human vascular endothelial growth factor A (VEGF-A) gene. The -950 to +450 region
of the VEGF-A promoter was scanned to identify nine nucleotide sites that are compatible
for recognition by available combinations of zinc finger domains in a three-fingered
configuration.

We constructed several DNA constructs encoding zinc finger proteins that include
domains designed to recognize such nine nucleotide sites. The proteins were expressed in
E. coli and purified. We evaluated their DNA binding specificity using a SELEX
(Systematic Evolution of Ligands by Exponential enrichment) experiment. Many zinc finger
proteins that were designed to target the VEGF-A promoter demonstrated the expected DNA-
binding specificities. Nearly all of the consensus sequences obtained from the SELEX
analyses were identical to the intended target sequences in the VEGF-A gene. One
exemplary zinc finger protein, termed F121, showed a consensus sequence that differed from
the intended target sequence by one base at a position where the corresponding zinc finger
shows degeneracy in base recognition.

Transcription factors that include a DNA binding domain with these artificial zinc 
fingers were generated by fusing nucleic acids encoding the three zinc finger domains to a 
nucleic acid encoding either the p65 or VP16 activation domain. The resulting nucleic acid 
was inserted into an expression plasmid. 

FIG. 4 shows the locations of the DNA binding sites in the VEGF-A promoter that
are recognized by these chimeric zinc finger proteins. The human VEGF-A promoter
contains at least two DNase I-hypersensitive regions. The binding of engineered zinc finger
protein transcription factors to these sites can activate VEGF-A gene expression. F480 was
designed to recognize a site at about -633R ("R" designates the reverse strand). F475 was
designed to recognize a site at about -455. F435 was designed to recognize a site at about
-391R and a site at about -90R. F83 was designed to recognize a site at about +359. F121
was designed to recognize a site at about +434.

We found that regardless of the location of the binding sites, four zinc finger proteins
(F480, F475, F121, and F435) that we tested activated not only a luciferase reporter gene
under the control of the VEGF-A promoter, but also the endogenous VEGF-A gene itself.
An ELISA on media from the transiently transfected cells indicated that these chimeric zinc
finger proteins also up-regulated production of the VEGF-A protein 13- to 21-fold.

When two of the chimeric zinc finger proteins, F435 and F121, were fused
individually to the KRAB repression domain, they each actively repressed VEGF-A
expression in HEK 293 cells. Control cells that had been transfected with the parental
expression vector (which contained no zinc finger protein coding sequences) did not show
any increase or decrease in VEGF-A mRNA or protein levels.

The protein F83 did not show any effect on the levels of VEGF-A mRNA or protein
in these assays. This may be due to the binding of some other protein to the target site or to
the local chromatin structure, which might have rendered the target DNA inaccessible to the
zinc finger protein. There was no absolute correlation between the levels of VEGF-A
expression by these zinc finger proteins and their DNA-binding affinities or their expression
levels.

To investigate the specificity of zinc finger proteins on a genome-wide scale, we
performed DNA microarray experiments with 293 cell lines that had been stably transfected
with DNA constructs that encode one of the following three zinc finger transcription factors:
F121-p65, F435-p65, and F475-VP16. Fifty-one of 7458 genes were regulated by all three
zinc finger transcriptional activator proteins. Forty-nine were up-regulated more than two-
fold, and two were down-regulated more than two-fold. Most of these co-regulated genes
appear to be closely associated with VEGF-A function. Many of them are regulated by
VEGF-A, involved in angiogenesis or hypoxia, or expressed in vascular endothelial cells.
Therefore, it is likely that these genes are downstream targets of VEGF-A expression. In
addition, numerous other genes were regulated by one or two of the zinc finger protein
activators but not by all three tested proteins. Since these zinc finger proteins recognize
nine-basepair sites, it is possible that these zinc finger proteins directly regulate genes other than
VEGF-A, e.g., by binding to identical or related target sites in other genes. Construction of
four, five, or six-fingered proteins may improve specificity. Taken together, these data
indicate that the described zinc finger proteins, which were assembled by shuffling naturally-
occurring zinc finger domains, function in cells as transcriptional regulators of specific
endogenous genes.
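
The selection of genes regulated by all three activators can be expressed as a set intersection, as in the sketch below. The gene names and expression ratios are invented for illustration, the ratio is assumed to be (transfected line)/(control), and the function name is hypothetical; only the two-fold threshold comes from the text.

```python
def coregulated_genes(ratios_by_construct, threshold=2.0):
    """Return (up, down): genes changed more than `threshold`-fold in every experiment.

    ratios_by_construct maps a construct name to {gene: expression ratio}.
    """
    up_sets, down_sets = [], []
    for ratios in ratios_by_construct.values():
        up_sets.append({g for g, r in ratios.items() if r > threshold})
        down_sets.append({g for g, r in ratios.items() if r < 1.0 / threshold})
    return set.intersection(*up_sets), set.intersection(*down_sets)


# Invented example data for three activator constructs
toy_arrays = {
    "F121-p65":  {"geneA": 3.0, "geneB": 0.40, "geneC": 2.5},
    "F435-p65":  {"geneA": 2.2, "geneB": 0.30, "geneC": 1.1},
    "F475-VP16": {"geneA": 5.0, "geneB": 0.45, "geneC": 4.0},
}
up, down = coregulated_genes(toy_arrays)
print(sorted(up), sorted(down))
```

Genes such as the toy "geneC", which pass the threshold for only some constructs, correspond to the class regulated by one or two of the activators but not by all three.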

For example, a protein described herein may regulate one or more of the following
genes: jun B proto-oncogene (N94468), EphA2 (H84481), EphB4 (AI261660), fibroblast
growth factor receptor 3 (achondroplasia, thanatophoric dwarfism) (AA419620), FK506-
binding protein 8 (38kD) (N95418), protein kinase C, zeta (AA458993), v-erb-b2 avian
erythroblastic leukemia viral oncogene homolog 3 (AA664212), lectin, galactoside-binding,
soluble, 1 (galectin 1) (AI927284), protein phosphatase 2, regulatory subunit B (B56), alpha
isoform (R59165), insulin-like growth factor 2 (somatomedin A) (N54596), plectin 1,
intermediate filament binding protein, 500kD (AA448400), Periplakin (AT703487), choline
kinase (H09959), collagen, type VI, alpha 1 (H99676), adaptor-related protein complex 1,
sigma 1 subunit (W44558), arrestin, beta 2 (AW009594), GATA-binding protein 2 (H00625),
cyclin-dependent kinase inhibitor 1A (p21, Cip1) (AI952615), mitogen-activated protein
kinase kinase kinase 11 (R80779), acetylcholinesterase (YT blood group) (AI360141), brain-
specific Na-dependent inorganic phosphate cotransporter (AA702627), cellular retinoic acid-
binding protein 1 (AA454702), cellular retinoic acid-binding protein 2 (AA598508), cadherin
13, H-cadherin (heart) (R41787), calcium channel, voltage-dependent, beta 3 subunit
(R36947), carbonic anhydrase XI (N52089), troponin T1, skeletal, slow (AA868929),
gamma-aminobutyric acid (GABA) B receptor, 1 (N70841), adenylate cyclase activating
polypeptide 1 (pituitary) receptor type I (H09078), solute carrier family 4, anion exchanger,
member 2 (erythrocyte membrane protein band 3-like 1) (W45518), glypican 1 (AA455896),
protein C inhibitor (plasminogen activator inhibitor III) (W86431), cyclin-dependent kinase
inhibitor 1C (p57, Kip2) (AI828088), zinc finger protein 43 (HTF6) (AA773894), zinc finger
protein homologous to Zip-36 in mouse (R38383), Meis (mouse) homolog 3 (AA703449),
SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d,
member 3 (AA053810), unknown (R11526), unknown (AA045731), unknown (T51849),
unknown (T50498), putative gene product (H09111), B/K protein (H23265), damage-specific
DNA binding protein 2 (48kD) (AA410404), dihydropyrimidinase-like 4 (AA757754), N-
methylpurine-DNA glycosylase (N26769), protein tyrosine phosphatase, receptor type, N
(R45941), fasciculation and elongation protein zeta 1 (zygin I) (H20759), lanosterol synthase
(2,3-oxidosqualene-lanosterol cyclase) (AA437389), spermidine/spermine N1-
acetyltransferase (AA011215), and a disintegrin-like and metalloprotease (reprolysin type)
with thrombospondin type 1 motif, 1 (T41173). The expression of these proteins or the genes
that encode them can be regulated at least 0.5, 1.0, 2, 5, 10, or 50-fold, e.g., between 2 and
80-fold.



Exemplary sites in the VEGF promoter and proteins that can recognize them include: 

Table 7: VEGF-A Promoter Sites (A) 

Protein   Site     Sequence
F475      -455     GAG CGG GGA
F121      +434     TGG GGG TGA
F435      -90R     GGG CGG GGA
F547      -665     AAT AGG GGG
F2825     +434     TGG GGG TGA



Table 8: VEGF-A Promoter Sites (B)

Protein   Site     Sequence
F480      -633R    GGG TGG GGG
F435      -391R    GGG TGG GGA
F2828     +435     GGG GGT GAC
F625      +435     GGG GGT GAC
F2830     +435     GGG GGT GAC
F2838     +435     GGG GGT GAC

Table 9: VEGF-A Promoter Sites (C)



Protein   Site     Sequence           SEQ ID NO:
F2604     -680     GTT TGG GAG GTC    76
F2605     -677     TGG GAG GTC AGA    77
F2607     -671     GTC AGA AAT AGG    78
F2615     -606     GCC AGA GCC GGG    79
F2633     -455     GAG CGG GGA GAA    80
F2634     -395R    GGG GAG AGG GAC    81
F2636     -393R    GTG GGG AGA GGG    82
F2644     -358R    GGG GCA GGG GAA    83
F2646     -314R    GAC AGG GCC TGA    84
F2650     -206     GGT GGG GGT CGA    85
F2679     +244R    CAA GTG GGG AAT    86



Table 10: VEGF-A Promoter Sites (D)

Protein   Site    Sequence           SEQ ID NO:

F2610     -633R   GGG TGG GGG GAG    87
F2612     -630R   AGG GGG TGG GGG    88
F2638     -391R   GGG TGG GGA GAG    89



Table 11: VEGF-A Promoter Sites (E)

Protein   Site    Sequence           SEQ ID NO:

F109      -536    GAG CGA GCA GCG    90
F2608     -668    AGA AAT AGG GGG    91
F2611     -631R   GGG GGT GGG GGG    92
F2617     -603    AGA GCC GGG GTG    93
F2619     -554    AGG GAA GCT GGG    94
F2623     -495    GTG GGT GAG TGA    95
F2625     -475    GTG TGG GGT TGA    96
F2628     -468    GTT GAG GGT GTT    97
F2629     -465    GAG GGT GTT GGA    98
F2630     -462    GGT GTT GGA GCG    99
F2634     -395R   GGG GAG AGG GAC    100
F2635     -394R   TGG GGA GAG GGA    101
F2637     -392R   GGT GGG GAG AGG    102
F2642     -385R   AGG GAC GGG TGG    103
F2643     [illegible]  [illegible]   104
F2648     [illegible]  [illegible]   105
F2651     [illegible]  [illegible]   106
F2653     [illegible]  [illegible]   107
F2654     -181R   AAT GAA GGG GAA    108
F2662     -124R   GCG GCT CGG GCC    109
F2667     -85     GGG CGG GCC GGG    110
F2668     -30R    AAA AAA GGG GGG    111
F2673     +77     GCA GCG GTT AGG    112
F2682     +283R   GGG GAA GTA GAG    113
F2689     +342    AGA GAA GTC GAG    114
F2697     +357    GAG AGA GAC GGG    115
F2699     +366    GGG GTC AGA GAG    116
F2703     -632R   GGG GTG GGG GGA    117
F2702     +474R   CAA GGG GGA GGG    118



Construction of a yeast expression plasmid for a zinc finger library 

We constructed an expression plasmid encoding a zinc finger transcription factor by
modification of pPC86 (Chevray and Nathans (1992) Proc. Natl. Acad. Sci. USA 89:5789-
5793). A gene encoding the Zif268 zinc finger protein was inserted between the SalI and
EcoRI sites of pPC86 to generate pPCFM-Zif, in which the Gal4 activation domain is fused
to the Zif268 domain. pPCFM-Zif was used as a vector for constructing libraries of zinc
fingers. To construct human zinc finger libraries, DNA segments encoding zinc fingers were
amplified from human genomic DNA using the polymerase chain reaction (PCR) (Promega,
Madison, WI) and mixtures of degenerate PCR primers encoding the sequence His-Thr-Gly-
Glu/Gln-Lys/Arg-Pro-Tyr/Phe, which is frequently found at the junction between zinc fingers
in naturally-occurring zinc finger proteins. The 100-bp PCR products encoding the zinc
fingers were digested with SacII and AvaI and inserted into pPCFM-Zif, yielding plasmids
that encode hybrid transcription factors consisting of finger 1 and finger 2 of Zif268 and a
zinc finger domain derived from the human genome (together forming a three-fingered
protein). The plasmid library was prepared from a total of 1.2 x 10^6 E. coli transformants.

Reporter plasmids were prepared by inserting one of 64 pairs of complementary
oligonucleotides that contained three copies of a 9-bp target sequence into pRS315(His) and
pLacZi (Clontech, Palo Alto, CA).

Gap repair cloning of human zinc finger domains selected from the human genome

Gap repair cloning of DNA sequences that encode individual zinc finger domains was
carried out as described (Hudson et al. (1997) Genome Res. 7:1169-1173). To clone a DNA
segment that encodes a zinc finger, two overlapping oligonucleotides were synthesized. Each
oligonucleotide included a 21-bp common tail at its 3' end for a second round of PCR as well
as a specific sequence that can anneal to the nucleic acid sequence that encodes the individual
zinc finger domain. DNA sequences encoding zinc fingers were amplified from human
genomic DNA with an equimolar mixture of the two corresponding oligonucleotides.

Amplification products from the initial round of PCR were used as templates in a
second round of PCR. The primers for the second round of PCR had two regions, one
identical to a segment of pPCFM-Zif and another identical to the 21-bp common tail. A
mixture of the second-round PCR products and linearized pPCFM-Zif that had been digested
with MscI and EcoRI was transformed into the yW1 (MATα Δgal4 Δgal80 lys2-801 his3-
Δ200 trp1-Δ63 leu2 ade2-101 CYH2) yeast strain. A total of 823 human zinc fingers were
cloned by this method. Many were used in our in vivo selection systems described herein.

In vivo selection of zinc finger domains 

Yeast mating was used to facilitate identification of zinc fingers that bind to each
three-basepair target site. The zinc finger library was introduced into the yW1 (MATα)
strain, and ~1.47 x 10^6 independent transformed yeast colonies were generated. Aliquots of
these transformed cells were mated for 5 h at 30°C with the haploid yeast strain yW1a
(MATa), which contained the 64 reporter plasmids in each of two sets (one for each of the
reporter genes). The reporter plasmids contained three copies of the target DNA sequences
adjacent to the coding regions of either the LacZ or HIS3 genes. The resulting diploids were
plated on selective media that contained X-gal (40 ng/ml) and 3-amino triazole (3-AT)
(1 mM) but lacked histidine. Plasmids isolated from blue (positive) colonies were re-
transformed to confirm the results and sequenced to identify the zinc finger domains they
encode. The binding affinity and specificity of each zinc finger fused to fingers 1 and 2 of
Zif268 were determined both in yeast and by EMSA. These methods are described below.

Construction of three-fingered proteins using selected zinc fingers as modular building 
blocks 

A modified version of the pcDNA3 (Invitrogen, Carlsbad, CA) vector (P3) was used
as a parental vector for expressing zinc finger proteins in mammalian cells. P3 contains an
HA tag and a nuclear localization signal, both of which were inserted 3' to the initiation
codon. DNA segments that encode individual zinc finger domains were subcloned into the
P3 vector between the EcoRI and NotI sites, and the resulting plasmids were used as starting
material for chimeric zinc finger protein construction. New three-fingered proteins were
prepared by two different methods. In the first method, all the zinc fingers were mixed, and
assembled three-fingered constructs were randomly chosen for further analysis. In the
second method, new three-fingered proteins were designed to target specific DNA sequences.
To this end, we used a simple computer algorithm that finds a match between recognition
sites of zinc fingers and target DNA sequences. We used promoter sequences of known
genes as the input DNA sequences and identified three-fingered proteins that should bind to
nine-basepair DNA elements within the input sequences.
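
The matching step described above can be illustrated with a short sketch. It is not the algorithm actually used, which the text does not detail: the finger labels, their triplet assignments, and the function names are hypothetical, and the sketch assumes that each finger recognizes exactly one 3-bp subsite and that the N-terminal finger contacts the 3'-most triplet of the site, as in Zif268.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_three_finger_sites(promoter, finger_sites):
    """List 9-bp sites that a three-fingered protein from the library could bind.

    finger_sites maps a (hypothetical) finger label to the 3-bp subsite it
    recognizes, written 5'->3'. Each hit is a tuple
    (strand, offset, site, fingers), where offset is the 0-based start of the
    site on the scanned strand and fingers lists the fingers from N- to
    C-terminus.
    """
    # Index the library by triplet so each window needs only three lookups.
    by_triplet = {}
    for label, triplet in finger_sites.items():
        by_triplet.setdefault(triplet, []).append(label)

    hits = []
    strands = (("+", promoter), ("-", reverse_complement(promoter)))
    for strand, seq in strands:
        for i in range(len(seq) - 8):
            site = seq[i:i + 9]
            triplets = [site[0:3], site[3:6], site[6:9]]
            if not all(t in by_triplet for t in triplets):
                continue
            # Assumption: the N-terminal finger binds the 3'-most triplet,
            # so fingers are listed in the reverse order of the triplets.
            choices = [by_triplet[t] for t in reversed(triplets)]
            for combo in product(*choices):
                hits.append((strand, i, site, combo))
    return hits
```

Calling `find_three_finger_sites` on a promoter string with a small library returns every window on either strand whose three triplets are all covered by the library; hits on the "-" strand correspond to the sites marked "R" in the tables above.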

Zinc finger proteins that target the VEGF-A gene were constructed by this method.
The constructed zinc finger proteins were tested for their DNA binding ability and affinity in
mammalian cells as described previously. Kim and Pabo (1997) J. Biol. Chem. 272, 29795-
29800; Kim and Pabo (1998) Proc. Natl. Acad. Sci. USA 95, 2812-2817; and Kang and Kim
(2000) J. Biol. Chem. 275:8742-8748. The reporter plasmid for the assay was constructed
using pGL3-TATA/Inr, which harbors the firefly luciferase gene as the reporter.

To connect functional domains to the zinc finger proteins, the transcriptional
activation domains of p65 (amino acids 288-548) and VP16 (amino acids 413-490) were
amplified by PCR using pairs of specific oligomers, and the PCR products for p65 and VP16
were cloned separately into P3 to generate pLFD-p65 and pLFD-VP16, respectively.
Nucleic acids that encode zinc finger proteins that target the VEGF-A promoter were inserted
into pLFD-p65 or pLFD-VP16 to express zinc finger protein-activation domain (AD) fusion
proteins (ZFP-AD). Real-time PCR, ELISA, and microarray analyses were carried out to
determine whether these ZFP-ADs activate the VEGF-A gene. In addition, SELEX was
performed to test whether these proteins recognize the appropriate target DNA sequences.
See below.

Binding affinity and specificity of human zinc finger domains 

Plasmids isolated from blue yeast colonies (see section entitled "In vivo selection of
zinc finger domains") were individually retransformed into yW1 cells. For each isolated
plasmid, re-transformed yW1 cells were mated to yW1a cells that contained each of the 64
LacZ reporter plasmids. The resulting cells were then spread onto minimal media that
contained X-gal and histidine but lacked tryptophan and uracil. Using the GEL-DOC™
system (Bio-Rad, Hercules, CA), we measured the intensity of the blue color for each colony
to determine the DNA-binding affinities and specificities of each of the zinc finger domains
that were fused to fingers 1 and 2 of Zif268. Control experiments with the Zif268 protein
indicated that positive interactions between a zinc finger domain and a target binding site in
the promoter of the LacZ reporter yielded dark to pale blue colonies (the blue intensity is
proportional to the binding affinity) and that negative interactions yielded white colonies.

Electrophoretic mobility shift assay (EMSA) 

DNA segments that encode zinc finger proteins were isolated by digestion with SalI
and NotI, and were inserted into pGEX-4T2 (Amersham Pharmacia, Uppsala, Sweden). Zinc
finger proteins were expressed in E. coli strain BL21(DE3) as fusion proteins linked to
glutathione-S-transferase (GST). The fusion proteins were purified using glutathione affinity
chromatography (Amersham Pharmacia) and then digested with thrombin. This cleavage
event severs the connection between the GST moiety and the zinc finger proteins. In this
case, purified zinc finger proteins contained fingers 1 and 2 of Zif268 fused to selected zinc
fingers in position 3 at the C-terminus. Probe DNAs were synthesized, annealed, and labeled
with 32P using T4 polynucleotide kinase, and EMSAs were carried out as described. Kim and
Pabo (1997) J. Biol. Chem. 272, 29795-29800 and Kim and Pabo (1998) Proc. Natl. Acad.
Sci. USA 95, 2812-2817. The same procedure can be used to test other zinc finger proteins.

Transcriptional regulation of endogenous VEGF

Human embryonic kidney 293 cells were maintained in Dulbecco's modified Eagle
medium (DMEM) supplemented with 100 units/ml penicillin, 100 µg/ml streptomycin, and
10% fetal bovine serum (FBS). For the luciferase assay, 10^4 cells/well were pre-cultured in a
96-well plate. Using a LIPOFECTAMINE™ transfection kit (Life Technologies, Rockville,
MD), 293 cells were transfected with 25 ng of a reporter plasmid in which the native VEGF-
A promoter was fused to the luciferase gene in pGL3-basic (Promega), and 25 ng of a
plasmid encoding a zinc finger protein. After 48 h of incubation, luciferase activity was
measured with a DUAL LUCIFERASE™ assay kit (Promega) using a TD-20/20
luminometer (Turner Designs Inc., Sunnyvale, CA).

For reverse transcriptase-PCR (RT-PCR) analyses and ELISA, 10^5 cells/well were
pre-cultured in 1 ml of culture medium (supplemented with 10% FBS but deprived of
antibiotics) in a 12-well culture plate for 24 h at 37°C in a humid atmosphere containing
5% CO2. The cells were then transfected with DNA using a LIPOFECTAMINE™
transfection kit (Life Technologies). Briefly, 1 ng of a plasmid encoding a zinc finger
protein was added to 5 µl of plus reagent in a total of 50 µl DMEM, and this solution was then
mixed with another 50 µl of DMEM containing 2 µl of LIPOFECTAMINE™ reagent. After
15 min of incubation, the entire 100 µl mixture was added to cells in a culture plate, and the
cells were grown for an additional 48 h. The cells and culture supernatants were harvested
for RT-PCR analysis and ELISA.

Quantitative RT-PCR 

Total cellular RNA was extracted from TRIZOL™ lysates according to the
manufacturer's instructions (Life Technologies). The reverse transcription reactions were
performed with 4 µg total RNA using oligo-dT as the first-strand synthesis primer for mRNA
and the MMLV reverse transcriptase provided in the SUPERSCRIPT™ first-strand synthesis
system (Life Technologies). To analyze mRNA quantities, 1 µl of the first-strand cDNAs
generated from the RT reactions was amplified using VEGF-A-specific primers. The initial
amounts of RNA were normalized to glyceraldehyde-3-phosphate dehydrogenase (GAPDH)
mRNA concentrations that had been calculated by specific amplification using GAPDH-
specific primers. The amplification of VEGF-A- and GAPDH-specific cDNAs was
monitored and analyzed in real time with a QUANTITECT SYBR™ kit (QIAGEN, Valencia,
CA) and a ROTORGENE™ 2000 real-time cycler (Corbett, Sydney, Australia) and was
quantified using serial dilutions of the standards included in the reactions.
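
As an illustration of quantification against serially diluted standards with GAPDH normalization, the following sketch fits a straight line to threshold cycle (Ct) versus log10 of the standard quantity and uses it to convert sample Ct values into quantities. The use of Ct values, the linear model, and all function names are assumptions made for this example; the text does not state how the cycler software performed the calculation.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(vegf_ct, gapdh_ct, vegf_curve, gapdh_curve):
    """VEGF-A quantity divided by the GAPDH quantity of the same sample.

    Each *_curve argument is the (slope, intercept) pair returned by
    fit_standard_curve for that amplicon.
    """
    vegf = quantity_from_ct(vegf_ct, *vegf_curve)
    gapdh = quantity_from_ct(gapdh_ct, *gapdh_curve)
    return vegf / gapdh
```

Dividing the normalized value of a ZFP-transfected sample by that of a control sample then gives a fold change in VEGF-A mRNA.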

ELISA

The kidney 293 cell culture supernatants were briefly centrifuged for 5 min to remove
cells and cell debris. VEGF protein that accumulated in the culture medium (100 µl each) and
dilutions of a recombinant human VEGF-A protein standard were analyzed using sandwich
ELISA (enzyme-linked immunosorbent assay), wherein the culture supernatant was
reacted with an anti-human VEGF antibody (R&D Systems; AF-293-NA), a biotinylated anti-
human VEGF antibody (R&D Systems; BAF293), and streptavidin-alkaline phosphatase. The
antigen-antibody complex was reacted with pNPP (p-nitrophenyl phosphate) dissolved in
pNPP buffer (Chemicon; ES011). VEGF-A concentrations in the samples were determined
from the absorbance at 405 nm, which was measured with a POWERWAVE™ X340 (Bio-TEK
Instrument Inc., Winooski, VT).

DNA microarray analysis of FlpTRex-293 cell lines stably expressing zinc finger proteins 

Plasmids encoding ZFPs designed to target the VEGF-A promoter were stably
introduced into FlpTRex-293 cell lines (Invitrogen) essentially as described in the
manufacturer's protocol. Briefly, the HindIII-XhoI fragment from a pLFD-p65 or a pLFD-
VP16 vector that contained DNA segments encoding zinc finger proteins was subcloned into
pcDNA5/FRT/TO (Invitrogen). The resulting plasmids were cotransfected with pOG44
(Invitrogen) into FlpTRex-293 cells, and stable integrants were screened. The resulting cell
lines express ZFP-p65 or ZFP-VP16 upon doxycycline induction.

DNA microarrays containing 7458 human expressed sequence tag (EST) clones were
provided by Genomic Tree, Inc. (Taejon, Korea). FlpTRex-293 cells stably expressing ZFP-
p65 or ZFP-VP16 were grown with (+Dox) or without (-Dox) 1 µg/ml doxycycline for 48 h.
Total RNA was prepared from each sample. RNA from a -Dox sample was used as the
reference (Cy3). Microarray experiments were performed according to the manufacturer's
protocol.

SELEX of assembled zinc finger proteins 

A template oligonucleotide was designed to contain a random 20-nucleotide region
flanked, on both sides, by invariant sequences. In addition, two primers that were
complementary to the invariant regions of the template oligonucleotide were designed for the
PCR amplification. The template oligonucleotide was converted to double-stranded DNA by
Klenow fragment extension from one of the primers. For enrichment of the target sequences
bound by zinc finger proteins, 100 µg of the GST-fusion proteins was mixed with 10 pmol of
double-stranded template DNA in 100 µl of binding buffer (25 mM Hepes pH 7.9, 40 mM
KCl, 3 mM MgCl2, 1 mM DTT) for one hour at room temperature. GST-resin (10 µl) was
then added to the mixture. After incubation for 30 min at room temperature, the resin was
washed three times with binding buffer containing 2.5% skim milk.

The bound double-stranded template oligomers were dissociated by incubating the
resins with 100 µl of 1 M KCl for 10 min at room temperature. After PCR amplification of
the rescued double-stranded template oligomers, a new round of SELEX was performed. This
process was repeated eight times. The final PCR product was digested with XbaI and BamHI
and inserted into pBLUESCRIPT™ KS digested with the same enzymes. The DNA
sequences of at least eight individual inserts per zinc finger protein were determined.
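
One way to summarize the sequenced inserts is to tabulate base counts at each position and derive a consensus site. The sketch below is an illustration only: it assumes the inserts have already been trimmed and aligned to a common length (a step not described in the text), and the function names and the 50% threshold are hypothetical choices.

```python
from collections import Counter


def position_counts(sites):
    """Count the bases observed at each position of pre-aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(site[i] for site in sites) for i in range(length)]


def consensus(sites, min_fraction=0.5):
    """Return the most frequent base per position, or N if none dominates.

    A base is reported only when it occurs in at least min_fraction of the
    sites at that position.
    """
    result = []
    for counts in position_counts(sites):
        base, n = counts.most_common(1)[0]
        result.append(base if n / len(sites) >= min_fraction else "N")
    return "".join(result)
```

Comparing the consensus with the intended 9-bp or 12-bp target indicates whether the assembled protein selects the sequence it was designed to bind.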


Example 7: Sequences of Exemplary Proteins 

The following are the amino acid sequences of the DNA binding regions of
exemplary proteins that can regulate VEGF-A:

Table 12: Amino Acid Sequences of DNA Binding Domains of Exemplary Proteins

Name    Amino Acid Sequence                               SEQ ID NO:

F475    YKCGQCGKFY SQVSHLTRHQ KIHTGEKPFQ CKTCQRKFSR       20
        SDHLKTHTRT HTGEKPYICR KCGRGFSRKS NLIRHQRTHT GEK

F121    YKCEECGKAF RQSSHLTTHK IIHTGEKPYK CMECGKAFNR       21
        RSHLTRHQRI HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT GEK

F435    YKCGQCGKFY SQVSHLTRHQ KIHTGEKPFQ CKTCQRKFSR       22
        SDHLKTHTRT HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT GEK

F547    YKCMECGKAF NRRSHLTRHQ RIHTGEKPFQ CKTCQRKFSR       23
        SDHLKTHTRT HTGEKPYECD HCGKAFSVSS NLNVHRRIHT GEK

F2825   YECDHCGKSF SQSSHLNVHK RTHTGEKPFL CQYCAQRFGR       24
        KDHLTRHMKK SHTGEKPFQC KTCQRKFSRS DHLKTHTRTH TGEK

F480    YKCMECGKAF NRRSHLTRHQ RIHTGEKPFQ CKTCQRKFSR       25
        SDHLKTHTRT HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT GEK

F2828   YKCKQCGKAF GCPSNLRRHG RTHTGEKPYR CEECGKAFRW       26
        PSNLTRHKRI HTGEKPFLCQ YCAQRFGRKD HLTRHMKKSH TGEK

F625    YKCKQCGKAF GCPSNLRRHG RTHTGEKPYR CEECGKAFRW       27
        PSNLTRHKRI HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT GEK

F2830   YRCKYCDRSF SDSSNLQRHV RNIHTGEKPY RCEECGKAFR       28
        WPSNLTRHKR IHTGEKPFLC QYCAQRFGRK DHLTRHMKKS HTGEK

F2838   YRCKYCDRSF SDSSNLQRHV RNIHTGEKPY RCEECGKAFR       29
        WPSNLTRHKR IHTGEKPYKC MECGKAFNRR SHLTRHQRIH TGEK

F2604   YSCGICGKSF SDSSAKRRHC ILHTGEKPYI CRKCGRGFSR       30
        KSNLIRHQRT HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYTCKQC GKAFSVSSSL RRHETTHTGE K

F2605   YKCEECGKAF RQSSHLTTHK IIHTGEKPYS CGICGKSFSD       31
        SSAKRRHCIL HTGEKPYICR KCGRGFSRKS NLIRHQRTHT
        GEKPFQCKTC QRKFSRSDHL KTHTRTHTGE K

F2607   FQCKTCQRKF SRSDHLKTHT RTHTGEKPYE CDHCGKAFSV       32
        SSNLNVHRRI HTGEKPYKCE ECGKAFRQSS HLTTHKIIHT
        GEKPYSCGIC GKSFSDSSAK RRHCILHTGE K

F2615   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYT CSDCGKAFRD       33
        KSCLNRHRRT HTGEKPYKCE ECGKAFRQSS HLTTHKIIHT
        GEKPYTCSDC GKAFRDKSCL NRHRRTHTGE K

F2633   YECEKCGKAF NQSSNLTRHK KSHTGEKPYK CGQCGKFYSQ       34
        VSHLTRHQKI HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYICRKC GRGFSRKSNL IRHQRTHTGE K

F2634   YKCKQCGKAF GCPSNLRRHG RTHTGEKPFQ CKTCQRKFSR       35
        SDHLKTHTRT HTGEKPYICR KCGRGFSRKS NLIRHQRTHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2636   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CEECGKAFRQ       36
        SSHLTTHKII HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPYVCDVE GCTWKFARSD ELNRHKKRHT GEK

F2644   YECEKCGKAF NQSSNLTRHK KSHTGEKPYK CMECGKAFNR       37
        RSHLTRHQRI HTGEKPYKCP DCGKSFSQSS SLIRHQRTHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2646   YKCEECGKAF RQSSHLTTHK IIHTGEKPYT CSDCGKAFRD       38
        KSCLNRHRRT HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYKCKQC GKAFGCPSNL RRHGRTHTGE K

F2650   YKCEECGKAF RQSSHLTTHK IIHTGEKPYR CEECGKAFRW       39
        PSNLTRHKRI HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPYRCEEC GKAFRWPSNL TRHKRIHTGE K

F2679   YECDHCGKAF SVSSNLNVHR RIHTGEKPYK CMECGKAFNR       40
        RSHLTRHQRI HTGEKPYVCD VEGCTWKFAR SDELNRHKKR
        HTGEKPYVCS KCGKAFTQSS NLTVHQKIHT GEK

F2610   YICRKCGRGF SRKSNLIRHQ RTHTGEKPYK CMECGKAFNR       41
        RSHLTRHQRI HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2612   YKCMECGKAF NRRSHLTRHQ RIHTGEKPFQ CKTCQRKFSR       42
        SDHLKTHTRT HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPFQCKTC QRKFSRSDHL KTHTRTHTGE K

F2638   YICRKCGRGF SRKSNLIRHQ RTHTGEKPYK CGQCGKFYSQ       43
        VSHLTRHQKI HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F109    YVCDVEGCTW KFARSDELNR HKKRHTGEKP YKCPDCGKSF       44
        SQSSSLIRHQ RTHTGEKPYK CEECGKAFRQ SSHLTTHKII
        HTGEKPYICR KCGRGFSRKS NLIRHQRTHT GEK

F2608   YKCMECGKAF NRRSHLTRHQ RIHTGEKPFQ CKTCQRKFSR       45
        SDHLKTHTRT HTGEKPYECD HCGKAFSVSS NLNVHRRIHT
        GEKPYKCEEC GKAFRQSSHL TTHKIIHTGE K

F2611   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CMECGKAFNR       46
        RSHLTRHQRI HTGEKPYRCE ECGKAFRWPS NLTRHKRIHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2617   YVCDVEGCTW KFARSDELNR HKKRHTGEKP YKCMECGKAF       47
        NRRSHLTRHQ RIHTGEKPYT CSDCGKAFRD KSCLNRHRRT
        HTGEKPYKCE ECGKAFRQSS HLTTHKIIHT GEK

F2619   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYE CNYCGKTFSV       48
        SSTLIRHQRI HTGEKPYECE KCGKAFNQSS NLTRHKKSHT
        GEKPFQCKTC QRKFSRSDHL KTHTRTHTGE K

F2623   YKCEECGKAF RQSSHLTTHK IIHTGEKPYI CRKCGRGFSR       49
        KSNLIRHQRT HTGEKPYRCE ECGKAFRWPS NLTRHKRIHT
        GEKPYVCDVE GCTWKFARSD ELNRHKKRHT GEK

F2625   YKCEECGKAF RQSSHLTTHK IIHTGEKPYR CEECGKAFRW       50
        PSNLTRHKRI HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYVCDVE GCTWKFARSD ELNRHKKRHT GEK

F2628   YTCKQCGKAF SVSSSLRRHE TTHTGEKPYR CEECGKAFRW       51
        PSNLTRHKRI HTGEKPYICR KCGRGFSRKS NLIRHQRTHT
        GEKPYTCKQC GKAFSVSSSL RRHETTHTGE K

F2629   YKCGQCGKFY SQVSHLTRHQ KIHTGEKPYT CKQCGKAFSV       52
        SSSLRRHETT HTGEKPYRCE ECGKAFRWPS NLTRHKRIHT
        GEKPYICRKC GRGFSRKSNL IRHQRTHTGE K

F2630   YVCDVEGCTW KFARSDELNR HKKRHTGEKP YKCGQCGKFY       53
        SQVSHLTRHQ KIHTGEKPYT CKQCGKAFSV SSSLRRHETT
        HTGEKPYRCE ECGKAFRWPS NLTRHKRIHT GEK

F2635   YKCGQCGKFY SQVSHLTRHQ KIHTGEKPYI CRKCGRGFSR       55
        KSNLIRHQRT HTGEKPYKCG QCGKFYSQVS HLTRHQKIHT
        GEKPFQCKTC QRKFSRSDHL KTHTRTHTGE K

F2637   FQCKTCQRKF SRSDHLKTHT RTHTGEKPYI CRKCGRGFSR       56
        KSNLIRHQRT HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPYRCEEC GKAFRWPSNL TRHKRIHTGE K

F2642   FQCKTCQRKF SRSDHLKTHT RTHTGEKPYK CMECGKAFNR       57
        RSHLTRHQRI HTGEKPYKCK QCGKAFGCPS NLRRHGRTHT
        GEKPFQCKTC QRKFSRSDHL KTHTRTHTGE K

F2643   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CKQCGKAFGC       58
        PSNLRRHGRT HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYKCKQC GKAFGCPSNL RRHGRTHTGE K

F2648   YKCPDCGKSF SQSSSLIRHQ RTHTGEKPYK CGQCGKFYSQ       59
        VSHLTRHQKI HTGEKPYICR KCGRGFSRKS NLIRHQRTHT
        GEKPYICRKC GRGFSRKSNL IRHQRTHTGE K

F2651   YECNYCGKTF SVSSTLIRHQ RIHTGEKPYK CEECGKAFRQ       60
        SSHLTTHKII HTGEKPYRCE ECGKAFRWPS NLTRHKRIHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2653   YECNYCGKTF SVSSTLIRHQ RIHTGEKPYE CEKCGKAFNQ       61
        SSNLTRHKKS HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPYECEKC GKAFNQSSNL TRHKKSHTGE K

F2654   YECEKCGKAF NQSSNLTRHK KSHTGEKPYK CMECGKAFNR       62
        RSHLTRHQRI HTGEKPYECE KCGKAFNQSS NLTRHKKSHT
        GEKPYECDHC GKAFSVSSNL NVHRRIHTGE K

F2662   YTCSDCGKAF RDKSCLNRHR RTHTGEKPFQ CKTCQRKFSR       63
        SDHLKTHTRT HTGEKPYECN YCGKTFSVSS TLIRHQRIHT
        GEKPYVCDVE GCTWKFARSD ELNRHKKRHT GEK

F2667   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYT CSDCGKAFRD       64
        KSCLNRHRRT HTGEKPFQCK TCQRKFSRSD HLKTHTRTHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2668   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CMECGKAFNR
        RSHLTRHQRI HTGEKPYVCS KCGKAFTQSS NLTVHQKIHT
        GEKPYVCSKC GKAFTQSSNL TVHQKIHTGE K

F2673   FQCKTCQRKF SRSDHLKTHT RTHTGEKPYT CKQCGKAFSV
        SSSLRRHETT HTGEKPYVCD VEGCTWKFAR SDELNRHKKR
        HTGEKPYKCP DCGKSFSQSS SLIRHQRTHT GEK

F2682   YICRKCGRGF SRKSNLIRHQ RTHTGEKPYK CPDCGKSFSQ
        SSSLIRHQRT HTGEKPYECE KCGKAFNQSS NLTRHKKSHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2689   YICRKCGRGF SRKSNLIRHQ RTHTGEKPYS CGICGKSFSD
        SSAKRRHCIL HTGEKPYECE KCGKAFNQSS NLTRHKKSHT
        GEKPYKCEEC GKAFRQSSHL TTHKIIHTGE K

F2697   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CKQCGKAFGC
        PSNLRRHGRT HTGEKPYKCE ECGKAFRQSS HLTTHKIIHT
        GEKPYICRKC GRGFSRKSNL IRHQRTHTGE K

F2699   YICRKCGRGF SRKSNLIRHQ RTHTGEKPYK CEECGKAFRQ
        SSHLTTHKII HTGEKPYSCG ICGKSFSDSS AKRRHCILHT
        GEKPYKCMEC GKAFNRRSHL TRHQRIHTGE K

F2703   YKCGQCGKFY SQVSHLTRHQ KIHTGEKPYK CMECGKAFNR
        RSHLTRHQRI HTGEKPYVCD VEGCTWKFAR SDELNRHKKR
        HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT GEK

F2702   YKCMECGKAF NRRSHLTRHQ RIHTGEKPYK CGQCGKFYSQ       54
        VSHLTRHQKI HTGEKPYKCM ECGKAFNRRS HLTRHQRIHT
        GEKPYVCSKC GKAFTQSSNL TVHQKIHTGE K



A polypeptide, e.g., one that includes a sequence described above, can also include a tag
(e.g., the HA tag), an NLS, a linker, and a regulatory domain (e.g., an activation or repression
domain). These elements can be arranged in any order, from N- to C-terminus. In one
example, the polypeptide is arranged as follows: HA tag-NLS-PGEKP-DNA binding
domain (e.g., a sequence described above)-AAA-p65. Or more particularly:

MVYPYDVPDYAELPPKKKRKVGIRIPGEKP-DNA BINDING DOMAIN-AAA-
p65 (wherein the leader N-terminal to the DNA binding domain is SEQ ID NO:126)

• YPYDVPDYA (3-12 of SEQ ID NO:126) is an exemplary tag (here the HA tag)

• PPKKKRKV (15-21 of SEQ ID NO:126) is an exemplary NLS (nuclear localization
signal)

• "ZFP" is an array of zinc finger domains

In another example, the polypeptide includes the DNA binding domain and a
repression domain, e.g., a KRAB or KOX domain.

Nucleic acid encoding a polypeptide described in this example can be produced
using any choice of codons, e.g., codons useful (e.g., optimized) for prokaryotic expression,
codons useful (e.g., optimized) for eukaryotic expression, or codons that encode
corresponding naturally occurring domains.

Results indicate that a number of zinc finger proteins can activate VEGF-A production.



Table 13: VEGF-A Activation

ZFP ID           Fold Activation    ZFP ID              Fold Activation    ZFP ID    Fold Activation

F109             3.5                F2625               2.1                F2653     2.6
F121             4.4                F2628               1.8                F2654     2.3
F435             12.5               F2629               3.8                F2668     2.1
F475             11.1               F2630               2.0                F2673     3.1
F480             9.2                F2633               11.9               F2679     4.5
F625             9.0                F2634               6.5                F2682     3.4
F2604            4.9                F2635               2.8                F2689     2.1
F2605            10.9               F2636               13.3               F2697     1.9
F2607            5.4                F2638               5.7                F2699     1.9
F2608            2.1                F2642               3.6                F2702     3.1
F2610            7.4                F2643               3.6                F2703     4.5
F2612            6.3                F2644               10.7               F2825     2.8
F2615            8.1                F2646               10.2               F2828     8.8
F2617            2.3                F2648               2.3                F2830     6.8
F2619            2.3                F2650               12.6               F2838     5.8
F2623            2.3                F2651               4.9
Irrelevant ZFP   1.1                Parental vector P3  1.0







Example 8: VEGF-A Production by encapsulated cells

A nucleic acid construct that includes a coding region encoding the F435-p65 zinc
finger protein operably linked to a doxycycline-inducible promoter was stably transfected
into FlpTRex-293 cells.

Human embryonic kidney (HEK) cell lines stably expressing ZFP-TFs were
generated as follows: Plasmids encoding ZFP-TFs were stably introduced into FlpTRex-293
cell lines (Invitrogen) essentially as described in the manufacturer's protocol. Briefly, the
HindIII-XhoI fragment from the pLFD-p65 vectors, which contain DNA segments that
encode ZFP-TFs, was subcloned into pcDNA5/FRT/TO (Invitrogen). The resulting plasmids
were cotransfected along with pOG44 (Invitrogen) into Flp-In™ TRex™-293 cells to induce
a site-specific integration event. Stable integrants were then screened. The resulting cell lines
expressed zinc finger protein conditionally upon the addition of doxycycline (1 µg/ml) to the
culture medium. To immobilize the cells stably expressing F435-p65, sodium alginate
(Sigma) was dissolved in PBS to a final concentration of 1% (wt/v) and gently mixed with
cells to a final density of 10^6 cells/ml. The suspension of cells was added dropwise into
CaCl2 (100 mM), where cellular capsules were solidified for 15 min, and the capsules were
then washed in PBS. The encapsulated cells were added to culture medium. Expression was
induced with 1 µg/ml doxycycline and the amount of VEGF-A produced by the encapsulated
cells was measured. In one experiment with F435-p65, the cells (cell line #151) grown in the
presence of doxycycline produced at least 600 pg/mL of VEGF-A after two days, at least
4000 pg/mL after three days, about 5000 pg/mL at four days, and at least 5300 pg/mL at five
days. VEGF-A production was at least 5, 10, 50, or 100 fold greater than controls that did not
include the F435-p65 zinc finger protein or cells that were not grown in the presence of
doxycycline.

Example 9: Cell-Based Assay for human VEGF-A expression 

3 x 10^4 HEK293T cells were transfected with 100 ng of each pLFD-4F-p65
plasmid in 96-well culture plates precoated with poly-L-lysine (Biocoat). The culture
supernatants were harvested at 48 hours post transfection and stored immediately at -80°C
until they were used. The transfection efficiency was estimated in one well of each plate,
transfected with 100 ng of lacZ, by staining with X-gal. The calculated transfection
efficiencies were in the range of 70-80% in each experiment.

The production of VEGF-A was analyzed by measuring secreted VEGF-A protein by
sandwich ELISA. The capture antibody (AF-293-NA) and biotinylated detection antibody
(BAF293) were purchased from R&D Systems, streptavidin-AP (SA110) and substrate buffer
(ES011) from Chemicon, and substrate pNPP (N-9389) from Sigma Aldrich. The ELISA
procedures were carried out with an automated workstation (GENESIS RSP 150™, TECAN).
The optical density (OD) at 405 nm was measured (POWERWAVE™ X340, BioTek
Instrument Inc.) and the quantity of VEGF-A was calculated from a standard curve obtained
from the OD values of serially diluted recombinant human VEGF-A protein (R&D Systems).
Relative VEGF-A production was calculated by normalizing VEGF-A concentrations
obtained from cultures individually transfected with pLFD-4F-p65 to that obtained from
cultures transfected with the parental vector P3.
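
The standard-curve calculation and the normalization to the parental vector can be sketched as follows. The text does not specify the curve model, so piecewise-linear interpolation between standards is an assumption here, as are the function names and the layout of the standards as (concentration, OD) pairs.

```python
from bisect import bisect_left


def concentration_from_od(od, standards):
    """Interpolate a concentration from an OD405 reading.

    standards is a list of (concentration, od) pairs from the serially
    diluted recombinant protein. Readings outside the range of the
    standards are rejected rather than extrapolated.
    """
    points = sorted(standards, key=lambda p: p[1])
    ods = [p[1] for p in points]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the range of the standard curve")
    j = bisect_left(ods, od)
    if ods[j] == od:
        return points[j][0]
    (c0, od0), (c1, od1) = points[j - 1], points[j]
    return c0 + (c1 - c0) * (od - od0) / (od1 - od0)


def relative_production(sample_od, parental_od, standards):
    """Sample concentration divided by the parental-vector concentration."""
    sample = concentration_from_od(sample_od, standards)
    parental = concentration_from_od(parental_od, standards)
    return sample / parental
```

A value of 1.0 corresponds to the parental vector itself, which matches how the fold-activation values are reported relative to the parental vector P3.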

Example 10: Cell-Based Assay for human VEGF-A expression

The zinc finger protein F121 consisted of three human zinc finger domains designed
to bind a 9 bp sequence of the human VEGF promoter at about nucleotide +434 relative to the
transcription initiation site of the human VEGF-A gene; F109 consisted of four human zinc
finger domains designed to bind a 12 bp sequence of the human VEGF promoter at about
the -536 nucleotide relative to the transcription initiation site of the human VEGF-A gene; and
F435 consisted of three human zinc finger domains designed to bind 9 bp sequences at the
positions -90R and -391R (wherein R means reverse strand) of the human VEGF-A gene.

Construction of luciferase reporter plasmids containing human VEGF promoter

The native human VEGF promoter DNA (at position -950 to +450, numbering
relative to the transcription initiation sequence shown in FIG. 1A, B, C) was PCR-amplified
from human genomic DNA using sequence-specific primers and cloned into the KpnI/XhoI
restriction site of plasmid pGL3 (Promega, E1751), and the resulting plasmid was designated
pGL3-VEGFprom (Fig. 5B).



Repression of the luciferase reporter containing native human VEGF promoter by zinc
finger protein

293 cells were transfected with the luciferase reporter plasmid pGL3-VEGFprom,
containing the native human VEGF promoter (-950 to +450 from the transcription initiation
site), and 30 ng of pLFD-F121-KRAB or pLFD-F109-KRAB. Luciferase activity was measured
as described. Fold repression values were calculated by normalizing the firefly luciferase
activity against the Renilla luciferase activity, and the result was compared with that of the
control wherein 293 cells were transfected with the control vector pLFD and the reporter
plasmid.

The plasmids encoding F121-KRAB (30 ng) and F109-KRAB (30 ng) reduced the 
reporter activity 8.7 fold and 6.1 fold, respectively. 
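
The fold repression calculation described above can be sketched, purely as an illustration, as the ratio of the normalized reporter activity of the control to that of the ZFP-transfected sample. The raw readings below are hypothetical values chosen for the example, and the function names are not part of the described method.

```python
def normalized_activity(firefly, renilla):
    """Firefly luciferase signal normalized to the Renilla luciferase signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_repression(control, sample):
    """Fold repression of a sample relative to the control.

    Both arguments are (firefly, renilla) tuples of raw readings.
    """
    return normalized_activity(*control) / normalized_activity(*sample)


if __name__ == "__main__":
    # Hypothetical raw readings in arbitrary light units.
    control_reading = (87000.0, 10000.0)  # control vector pLFD + reporter
    zfp_reading = (10000.0, 10000.0)      # ZFP-KRAB plasmid + reporter
    print(round(fold_repression(control_reading, zfp_reading), 1))
```

Normalizing to the Renilla signal before taking the ratio corrects for well-to-well differences in transfection efficiency and cell number.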

Repression of endogenous VEGF-A mRNA expression by ZFP-KRAB 
ZFP expression plasmids were transfected into human embryonic kidney 293F cells
(Gibco Life Technologies). 293F cells allow for high transfection efficiencies.

293F cells were precultured in the wells of a 24-well culture plate, at a density of 10^5
cells/well, in 1 ml of DMEM supplemented with 10% FBS for 24 h in a humid atmosphere
containing 5% CO2 at 37°C. The cells were transfected with 0, 200, or 400 ng of plasmids
encoding the chimeric zinc finger proteins of interest using LIPOFECTAMINE PLUS™ (Life
Technologies). The total amount of DNA was adjusted to 400 ng by adding the parental
vector as a control if less than 400 ng of the zinc finger protein expression vector was used.
The cells were further incubated for 48 hours. Total RNA was extracted from the cells
with the TRIZOL® reagent (Gibco Life Technologies).

Quantification of VEGF mRNA was carried out by the following real-time RT-PCR procedure.

The reverse transcription reactions were performed with 4 µg of the total RNA using
oligo-dT as the first-strand synthesis primer for mRNA, dNTP, and MMLV reverse
transcriptase provided in the Superscript first-strand synthesis system (Gibco Life
Technologies) to obtain a first-strand cDNA. To analyze mRNA quantities, 1 µl of the
first-strand cDNA thus obtained was amplified by real-time PCR using VEGF-A cDNA-specific
primers (forward primer 5'-CGGGGTACCCCCTCCCAGTCACTGACTAAC-3', SEQ ID NO: 127;
reverse primer 5'-CCGCTCGAGTCCGGCGGTCACCCCCAAAAG-3', SEQ ID NO: 128). Since this
method is sensitive to the initial amount of RNA, the initial RNA amounts were normalized
to the GAPDH mRNA quantities calculated by specific amplification using GAPDH-specific
primers. The amplification of VEGF- and GAPDH-specific cDNAs was monitored and analyzed
in real time with a QUANTITECT SYBR™ kit (QIAGEN, Valencia, CA) and a ROTORGENE™ 2000
real-time cycler (Corbett, Sydney, Australia), and the cDNAs were quantified by serial
dilution of the standards included in the reactions.
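
As a minimal sketch of the normalization described above, the VEGF quantity of each sample can be divided by its GAPDH quantity before samples are compared. The input quantities below are hypothetical, and the function names are illustrative; the conversion from fold repression to percent repression, (1 - 1/fold) x 100, is ordinary arithmetic rather than a procedure quoted from the text.

```python
def normalized_level(target_qty, reference_qty):
    """Target (VEGF) quantity divided by the reference (GAPDH) quantity."""
    if reference_qty <= 0:
        raise ValueError("reference quantity must be positive")
    return target_qty / reference_qty


def fold_repression(control_level, treated_level):
    """How many times lower the treated level is than the control level."""
    return control_level / treated_level


def percent_repression(fold):
    """Convert a fold repression into a percent reduction versus control."""
    return (1.0 - 1.0 / fold) * 100.0


if __name__ == "__main__":
    # Hypothetical quantities read off the standard dilution series.
    control = normalized_level(target_qty=8.8, reference_qty=4.0)
    treated = normalized_level(target_qty=4.0, reference_qty=4.0)
    fold = fold_repression(control, treated)
    print(round(fold, 1), round(percent_repression(fold), 1))
```

With this conversion, a 2.2 fold repression corresponds to about 54.5% repression and a 4.1 fold repression to about 75.6%.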



Repression of VEGF-A mRNA synthesis by zinc finger proteins 

The expression of endogenous VEGF-A mRNA was reduced 2.2 fold (54.5%
repression, 200 ng pLFD-F435-KRAB) and 4.1 fold (75.6% repression, 400 ng pLFD-F435-
KRAB) relative to untreated control cells. These results show a dose-dependent effect.

Repression of VEGF-A protein production by ZFP (F435-KRAB)

In order to examine whether the repression of VEGF-A mRNA expression resulted
in a reduction of VEGF-A protein secretion, 293F cells (10^4 cells/well, 96-well plate) were
transfected with 0 to 200 ng of the ZFP expression plasmid (pLFD-F435-KRAB) and cultured
for 72 hours. VEGF protein that accumulated in the culture medium (100 µl each) and
dilutions of a recombinant human VEGF-A protein standard were analyzed using sandwich
ELISA (enzyme-linked immunosorbent assay), wherein the culture supernatant was reacted
with an anti-human VEGF antibody (R&D Systems; AF-293-NA), a biotinylated anti-human
VEGF antibody (R&D Systems; BAF293), and streptavidin-alkaline phosphatase. The
antigen-antibody complex was reacted with pNPP (p-nitrophenyl phosphate) dissolved in
pNPP buffer (Chemicon; ES011). VEGF-A concentrations in the samples were determined from
the absorbance at 405 nm, which was measured with a POWERWAVE™ X340 (Bio-TEK
Instrument Inc., Winooski, VT).

F435-KRAB reduced VEGF-A production in a dose-dependent manner. When
200 ng of the pLFD-F435-KRAB plasmid was used, the VEGF-A protein concentration was
repressed 3.9 fold (138 pg/ml) relative to control cells transfected with 200 ng of a control
plasmid (536 pg/ml). See Table 14.



Table 14: Titration of F435-KRAB

Concentration of F435-KRAB plasmid (ng)   25         50         100        200        Control (200 ng)
VEGF-A (pg/ml)                            420 ± 98   345 ± 50   172 ± 13   138 ± 14   536 ± 14
Fold Repression                           1.3        1.6        3.1        3.9        1.0
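
The fold repression values in Table 14 can be reproduced from the mean VEGF-A concentrations, as in the following sketch. It assumes that each fold value is the control mean divided by the sample mean, rounded to one decimal place; the variable and function names are illustrative.

```python
# Mean VEGF-A concentrations (pg/ml) from Table 14.
CONTROL_PG_ML = 536.0
TITRATION_PG_ML = {25: 420.0, 50: 345.0, 100: 172.0, 200: 138.0}  # ng plasmid -> pg/ml


def fold_repression_by_dose(control, samples):
    """Return {dose: control / concentration}, rounded to one decimal place."""
    return {dose: round(control / conc, 1) for dose, conc in samples.items()}


if __name__ == "__main__":
    for dose, fold in fold_repression_by_dose(CONTROL_PG_ML, TITRATION_PG_ML).items():
        print(f"{dose} ng: {fold} fold")
```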



Repression of VEGF-A gene induction by hypoxic conditions

The VEGF-A gene is known as a crucial factor for inducing angiogenesis. VEGF-A
activity is essential for the development and growth of many tumors. VEGF-A activity has
been found to be stimulated by hypoxic conditions in cancer tissues. A high level of VEGF-A
expression is frequently observed in tumor cells.

When the medium for culturing 293F cells is treated with 100 to 800 µM of CoCl2
for about 7 hours, a hypoxic condition is induced and VEGF production by the cells rapidly
escalates. The following experiment was carried out in order to examine whether the zinc
finger protein can inhibit VEGF expression under the hypoxic condition.

293F cells (10^4 cells/well, 96-well plate) were transfected with 50 ng of
pLFD-F435-KRAB and incubated for 48 hours. In order to induce the hypoxic condition,
800 µM of CoCl2 was added to the medium for the last 7 hours of the culture. The amount
of VEGF-A secreted into the culture medium was determined by ELISA.

VEGF production by mock-transfected cells in the hypoxic, CoCl2-treated culture
increased to about 1,039 pg/ml, in contrast to about 273 pg/ml in the untreated control
cells. This observation confirms that hypoxia strongly induces VEGF-A production.
However, cells transfected with pLFD-F435-KRAB did not induce VEGF-A production under
hypoxic conditions. These cells produced only about 272 pg/ml of VEGF-A, a concentration
similar to the non-hypoxic control. This result demonstrates that expression of F435-KRAB
inhibits VEGF-A production under hypoxic conditions. Since the transfection rate was only
about 85-90%, it is possible that the residual level of VEGF-A production is due to the
untransfected cells in the culture. We concluded that F435-KRAB and similarly functional
chimeric zinc finger proteins are potent repressors of VEGF-A expression.

The selected zinc finger proteins or related proteins that include domains with the
same motifs may be used, e.g., as therapeutic agents. Such agents can be used, e.g., to repress
VEGF-A expression and thereby retard the growth of tumor cells.

A number of embodiments of the invention have been described. Nevertheless, it will
be understood that various modifications may be made without departing from the spirit and
scope of the invention. Accordingly, other embodiments are within the scope of the
following claims.

WHAT IS CLAIMED IS:

1. A polypeptide comprising a DNA binding domain that includes a plurality of
zinc finger domains, wherein

the DNA binding domain can bind to a site in a VEGF gene, and

at least two of the zinc finger domains each include respective zinc finger domain
motifs listed in column 2 of Table 1, Table 2, Table 3, Table 4 or Table 5.



2. The polypeptide of claim 1, wherein the zinc finger domains each include
respective zinc finger domain motifs listed in column 2 of Table 1 or Table 3.

3. The polypeptide of claim 2, wherein the zinc finger domain is selected from 
the group consisting of the zinc finger domains listed in column 3 of Table 1 or Table 3. 



4. The polypeptide of claim 3, wherein the DNA binding domain includes, in
N-terminal to C-terminal order, first, second and third zinc finger domains, wherein

(1) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSHR, those of the second zinc finger domain are RDHT, and those of the third zinc
finger domain are RSX1R, wherein X1 is H or N;

(2) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSHX2, those of the second zinc finger domain are RX3HR, and those of the third zinc
finger domain are RDHT, wherein X2 is T or V and X3 is S or D;

(3) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are RSHR, those of the second zinc finger domain are RDHT, and those of the third zinc 
finger domain are VSNV; 

(4) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are RDER, those of the second zinc finger domain are QSSR, and those of the third zinc 
finger domain are QSHT; 

(5) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSSR, those of the second zinc finger domain are QSHT, and those of the third zinc
finger domain are RSNR;

(6) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSNR, those of the second zinc finger domain are QSHR, and those of the third zinc
finger domain are RDHT;

(7) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are QSHR, those of the second zinc finger domain are RDHT, and those of the third zinc 
finger domain are RSNR; 

(8) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are RSHR, those of the second zinc finger domain are QSHT, and those of the third zinc 
finger domain are RSHR; 

(9) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are QSHT, those of the second zinc finger domain are RSHR, and those of the third zinc 
finger domain are RDER; 

(10) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger
domain are QSNR, those of the second zinc finger domain are RSHR, and those of the third
zinc finger domain are QSSR;

(11) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger
domain are RSHR, those of the second zinc finger domain are QSSR, and those of the third
zinc finger domain are RSHR;

(12) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are QSHT, those of the second zinc finger domain are WSNR, and those of the third 
zinc finger domain are RSHR; or 

(13) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are WSNR, those of the second zinc finger domain are RSHR, and those of the third 
zinc finger domain are WSNR. 

5. The polypeptide of claim 2, wherein the zinc finger domain is selected from
the group consisting of the zinc finger domains listed in column 3 of Table 2, Table 4 or
Table 5.

6. The polypeptide of claim 1, wherein the VEGF gene is the human VEGF-A
gene.

7. The polypeptide of claim 1, which regulates the VEGF gene expression.

8. The polypeptide of claim 1, wherein the polypeptide further comprises a 
transcription activation domain, a transcription repression domain, or a protein transduction 
domain. 

9. The polypeptide of claim 8, wherein the transcription activation domain 
comprises p65 or VP16 activation domain. 

10. The polypeptide of claim 8, wherein the transcription repression domain 
comprises Kid or KOX repression domain. 

11. The polypeptide of claim 8, wherein the protein transduction domain is a part 
of TAT protein, VP22 protein, or Antennapedia homeodomain. 

12. A nucleic acid that comprises a sequence encoding the polypeptide of claim 1.

13. The nucleic acid of claim 12, which comprises a sequence encoding the
polypeptide of claim 8.

14. A modified mammalian cell that contains the polypeptide of claim 1.

15. The cell of claim 14, wherein the polypeptide is produced from a nucleic acid 
of claim 14 in the cell. 

16. A pharmaceutical composition for preventing or treating a neoplastic disorder,
an inflammatory disorder, or an angiogenesis-based disorder, which comprises the
polypeptide of claim 1, the nucleic acid of claim 12 or the modified mammalian cell of claim
14, and a pharmaceutically acceptable carrier.

17. The pharmaceutical composition of claim 16, wherein the neoplastic disorder
is a cancer.

18. The pharmaceutical composition of claim 16, wherein the zinc finger domain
included in the polypeptide is selected from the group consisting of the zinc finger domains 
listed in column 3 of Table 1 or Table 3. 

19. The pharmaceutical composition of claim 18, wherein the polypeptide
comprises a DNA binding domain that includes, in N-terminal to C-terminal order, first,
second and third zinc finger domains, wherein

(1) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSHR, those of the second zinc finger domain are RDHT, and those of the third zinc
finger domain are RSX1R, wherein X1 is H or N;

(2) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSHX2, those of the second zinc finger domain are RX3HR, and those of the third zinc
finger domain are RDHT, wherein X2 is T or V and X3 is S or D;

(3) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are RSHR, those of the second zinc finger domain are RDHT, and those of the third zinc
finger domain are VSNV;

(4) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are RDER, those of the second zinc finger domain are QSSR, and those of the third zinc
finger domain are QSHT;

(5) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSSR, those of the second zinc finger domain are QSHT, and those of the third zinc
finger domain are RSNR;

(6) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSNR, those of the second zinc finger domain are QSHR, and those of the third zinc
finger domain are RDHT;

(7) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are QSHR, those of the second zinc finger domain are RDHT, and those of the third zinc
finger domain are RSNR;

(8) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain
are RSHR, those of the second zinc finger domain are QSHT, and those of the third zinc
finger domain are RSHR;

(9) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger domain 
are QSHT, those of the second zinc finger domain are RSHR, and those of the third zinc 
finger domain are RDER; 

(10) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are QSNR, those of the second zinc finger domain are RSHR, and those of the third 
zinc finger domain are QSSR; 

(11) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are RSHR, those of the second zinc finger domain are QSSR, and those of the third 
zinc finger domain are RSHR; 

(12) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are QSHT, those of the second zinc finger domain are WSNR, and those of the third 
zinc finger domain are RSHR; or 

(13) the DNA contacting residues at positions -1, 2, 3, and 6 of first zinc finger 
domain are WSNR, those of the second zinc finger domain are RSHR, and those of the third 
zinc finger domain are WSNR.

20. An encapsulated composition comprising 

an encapsulation layer composed of a biocompatible material that is
permeable to proteins having a molecular weight of at least 10 kDa, and

recombinant mammalian cells, wherein the cells contain a nucleic acid comprising a 
sequence encoding a chimeric zinc finger protein that regulates production of a secreted 
factor. 

21. The encapsulated composition of claim 20, wherein the secreted factor is
insulin, an insulin-like growth factor, VEGF, HGF, interferon, interleukin, or a fibroblast
growth factor.

22. A method of regulating VEGF gene expression, which comprises
introducing the polypeptide of claim 1 or the nucleic acid of claim 12 into a cell.

23. The method of claim 22, wherein the polypeptide comprises a transcription
activation domain, and VEGF gene expression is increased in the cell. 

24. The method of claim 22, wherein the polypeptide comprises a transcription 
repression domain, and VEGF gene expression is decreased in the cell. 

25. The method of claim 22, wherein the VEGF gene is human VEGF-A gene. 

26. The method of claim 22, wherein the cell is a mammalian cell.

27. The method of claim 26, wherein the cell is a human cell.

28. A method of modulating angiogenesis in a subject, which comprises
administering the composition of claim 16 to the subject in an amount
effective to reduce angiogenesis in the subject.

29. The method of claim 28, wherein the subject is a human that has or is
suspected of having a cancer.

<120> Regulatory Zinc Finger Proteins

US 60/431,892

<151> 2002-12-09

<160> 129

<170> PatentIn version 3.2

<210> 1

<211> 23

<212> PRT

<213> Homo sapiens

<400> 1

Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu
1 5 10 15

Arg Arg His Gly Arg Thr His
20



<210> 2 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 2 

Tyr Ser Cys Gly Ile Cys Gly Lys Ser Phe Ser Asp Ser Ser Ala Lys
1 5 10 15

Arg Arg His Cys Ile Leu His
20 



<210> 3 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 3 



Tyr Thr Cys Ser Asp Cys Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu
1 5 10 15

Asn Arg His Arg Arg Thr His 
20 



<210> 4 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His
20 



<210> 5 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 5 



Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His
20 



<210> 6 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 6 

Tyr Glu Cys Glu Lys Cys Gly Lys Ala Phe Asn Gln Ser Ser Asn Leu
1 5 10 15

Thr Arg His Lys Lys Ser His
20



<210> 7

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 7 

Tyr Val Cys Ser Lys Cys Gly Lys Ala Phe Thr Gln Ser Ser Asn Leu
1 5 10 15

Thr Val His Gln Lys Ile His
20 



<210> 8 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 8 

Tyr Lys Cys Pro Asp Cys Gly Lys Ser Phe Ser Gln Ser Ser Ser Leu
1 5 10 15

Ile Arg His Gln Arg Thr His
20 



<210> 9 

<211> 25 

<212> PRT 

<213> Homo sapiens 

<400> 9 

Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe Ala Arg Ser Asp 
1 5 10 15

Glu Leu Asn Arg His Lys Lys Arg His
20 25



<210> 10

<211> 23

<212> PRT

<213> Homo sapiens

<400> 10

Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu
1 5 10 15

Lys Thr His Thr Arg Thr His 
20 



<210> 11 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 11 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu 
1 5 10 15

Thr Arg His Gln Arg Ile His
20 



<210> 12 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 12 



Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His
20 



<210> 13 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 13 

Tyr Glu Cys Asp His Cys Gly Lys Ala Phe Ser Val Ser Ser Asn Leu 
1 5 10 15

Asn Val His Arg Arg Ile His
20 



WO 2004/053130

PCT/KR2003/002693



<210> 14 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 14 

Tyr Thr Cys Lys Gln Cys Gly Lys Ala Phe Ser Val Ser Ser Ser Leu
1 5 10 15

Arg Arg His Glu Thr Thr His 
20 



<210> 15 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 15 

Tyr Glu Cys Asn Tyr Cys Gly Lys Thr Phe Ser Val Ser Ser Thr Leu
1 5 10 15

Ile Arg His Gln Arg Ile His
20 



<210> 16 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 16 

Tyr Arg Cys Glu Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu 
1 5 10 15

Thr Arg His Lys Arg Ile His
20



<210> 17

<211> 23

<212> PRT

<213> Homo sapiens

<400> 17

Tyr Glu Cys Asp His Cys Gly Lys Ser Phe Ser Gln Ser Ser His Leu
1 5 10 15

Asn Val His Lys Arg Thr His 
20 



<210> 18 

<211> 23 

<212> PRT 

<213> Homo sapiens 

<400> 18 

Phe Leu Cys Gln Tyr Cys Ala Gln Arg Phe Gly Arg Lys Asp His Leu
1 5 10 15

Thr Arg His Met Lys Lys Ser
20



<210> 19

<211> 24

<212> PRT

<213> Artificial

<220>

<223> Artificial zinc finger domain

<400> 19

Tyr Arg Cys Lys Tyr Cys Asp Arg Ser Phe Ser Asp Ser Ser Asn Leu
1 5 10 15

Gln Arg His Val Arg Asn Ile His
20 



<210> 20 

<211> 83 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 20 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg
50 55 60

Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys 



<210> 21 

<211> 83 

<212> PRT 

<213> Artificial 

<220> 



<223> artificial zinc finger protein 
<400> 21 

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr 
65 70 75 80 

Gly Glu Lys



<213> Artificial

<220>

<223> artificial zinc finger protein 
<400> 22 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys



<210> 23
<211> 83
<212> PRT

<213> Artificial

<220>

<223> artificial zinc finger protein
<400> 23

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Glu Cys Asp His Cys Gly Lys
50 55 60

Ala Phe Ser Val Ser Ser Asn Leu Asn Val His Arg Arg Ile His Thr
65 70 75 80 

Gly Glu Lys



<210> 24

<211> 84

<212> PRT 

<213> Artificial 



<220> 

<223> artificial zinc finger protein 
<400> 24 

Tyr Glu Cys Asp His Cys Gly Lys Ser Phe Ser Gln Ser Ser His Leu
1 5 10 15

Asn Val His Lys Arg Thr His Thr Gly Glu Lys Pro Phe Leu Cys Gln
20 25 30

Tyr Cys Ala Gln Arg Phe Gly Arg Lys Asp His Leu Thr Arg His Met
35 40 45

Lys Lys Ser His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln
50 55 60



Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His 
65 70 75 80 



Thr Gly Glu Lys



<211> 83

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 25 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu 
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys



<210> 26

<211> 84

<212> PRT

<213> Artificial

<220>

<223> artificial

<400> 26

Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu
1 5 10 15

Arg Arg His Gly Arg Thr His Thr Gly Glu Lys Pro Tyr Arg Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Phe Leu Cys Gln Tyr Cys Ala Gln
50 55 60



Arg Phe Gly Arg Lys Asp His Leu Thr Arg His Met Lys Lys Ser His 
65 70 75 80 

Thr Gly Glu Lys 



<210> 27 

<211> 83 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 

<400> 27 

Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu
1 5 10 15

Arg Arg His Gly Arg Thr His Thr Gly Glu Lys Pro Tyr Arg Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80 

Gly Glu Lys 



<210> 28 

<211> 85 

<212> PRT 

<213> Artificial 

<220> 



<223> artificial zinc finger protein 
<400> 28 

Tyr Arg Cys Lys Tyr Cys Asp Arg Ser Phe Ser Asp Ser Ser Asn Leu 

1 5 10 15

Gln Arg His Val Arg Asn Ile His Thr Gly Glu Lys Pro Tyr Arg Cys
20 25 30

Glu Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His
35 40 45

Lys Arg Ile His Thr Gly Glu Lys Pro Phe Leu Cys Gln Tyr Cys Ala
50 55 60

Gln Arg Phe Gly Arg Lys Asp His Leu Thr Arg His Met Lys Lys Ser
65 70 75 80 

His Thr Gly Glu Lys
85



<212> PRT

<213> Artificial

<220>

<223> artificial zinc finger protein 
<400> 29 

Tyr Arg Cys Lys Tyr Cys Asp Arg Ser Phe Ser Asp Ser Ser Asn Leu 
1 5 10 15 

Gln Arg His Val Arg Asn Ile His Thr Gly Glu Lys Pro Tyr Arg Cys
20 25 30

Glu Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His
35 40 45

Lys Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly
50 55 60

Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His
65 70 75 80 

Thr Gly Glu Lys 



<210> 30 
<211> 111 
<212> PRT 
<213> Artificial



<220> 

<223> artificial zinc finger protein 
<400> 30 

Tyr Ser Cys Gly Ile Cys Gly Lys Ser Phe Ser Asp Ser Ser Ala Lys
1 5 10 15

Arg Arg His Cys Ile Leu His Thr Gly Glu Lys Pro Tyr Ile Cys Arg
20 25 30

Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Thr Cys Lys Gln Cys Gly Lys Ala Phe Ser Val
85 90 95 

Ser Ser Ser Leu Arg Arg His Glu Thr Thr His Thr Gly Glu Lys 
100 105 110 



<210> 31 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 31 

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Ser Cys Gly
20 25 30

Ile Cys Gly Lys Ser Phe Ser Asp Ser Ser Ala Lys Arg Arg His Cys
35 40 45

Ile Leu His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg
50 55 60

Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg
85 90 95

Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 32

<211> 111

<212> PRT

<213> Artificial

<220>

<223> artificial zinc finger protein

<400> 32

Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu
1 5 10 15

Lys Thr His Thr Arg Thr His Thr Gly Glu Lys Pro Tyr Glu Cys Asp
20 25 30

His Cys Gly Lys Ala Phe Ser Val Ser Ser Asn Leu Asn Val His Arg
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Ser Cys Gly Ile Cys Gly Lys Ser Phe Ser Asp
85 90 95

Ser Ser Ala Lys Arg Arg His Cys Ile Leu His Thr Gly Glu Lys
100 105 110 



<210> 33 

<211> 111 

<212> PRT 

<213> Artificial 

<220>

<223> artificial zinc finger protein 

<400> 33 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu 
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Ser
20 25 30

Asp Cys Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu Asn Arg His Arg
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Thr Cys Ser Asp Cys Gly Lys Ala Phe Arg Asp
85 90 95

Lys Ser Cys Leu Asn Arg His Arg Arg Thr His Thr Gly Glu Lys
100 105 110 



<210> 34 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 34 

Tyr Glu Cys Glu Lys Cys Gly Lys Ala Phe Asn Gln Ser Ser Asn Leu
1 5 10 15

Thr Arg His Lys Lys Ser His Thr Gly Glu Lys Pro Tyr Lys Cys Gly
20 25 30

Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu Thr Arg His Gln
35 40 45

Lys Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg
85 90 95

Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 35 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 35 

Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu
1 5 10 15

Arg Arg His Gly Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg
50 55 60

Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 36

<211> 113

<212> PRT

<213> Artificial

<220>

<223> artificial zinc finger protein

<400> 36

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys
35 40 45

Ile Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe
85 90 95

Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu
100 105 110



WO 2004/053130 



f 

PCT/KR2003/002693 



Lys 



<210> 37 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 



<400> 37 

Tyr Glu Cys Glu Lys Cys Gly Lys Ala Phe Asn Gln Ser Ser Asn Leu
1 5 10 15

Thr Arg His Lys Lys Ser His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Pro Asp Cys Gly Lys
50 55 60

Ser Phe Ser Gln Ser Ser Ser Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 38 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 38 

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Ser
20 25 30

Asp Cys Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu Asn Arg His Arg
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys
85 90 95

Pro Ser Asn Leu Arg Arg His Gly Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 39 

<211> 111 

<212> PRT 

<213> Artificial 



<220> 

<223> artificial zinc finger protein 

<400> 39

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Arg Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys Ala Phe Arg Trp
85 90 95

Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 40

<211> 113 

<212> PRT 

<213> Artificial 



<220>

<223> artificial zinc finger protein 
<400> 40 

Tyr Glu Cys Asp His Cys Gly Lys Ala Phe Ser Val Ser Ser Asn Leu
1 5 10 15

Asn Val His Arg Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys
50 55 60

Thr Trp Lys Phe Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Val Cys Ser Lys Cys Gly Lys Ala Phe
85 90 95

Thr Gln Ser Ser Asn Leu Thr Val His Gln Lys Ile His Thr Gly Glu
100 105 110

Lys 



<210> 41 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 41 

Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 42
<211> 111
<212> PRT
<213> Artificial

<220>

<223> artificial zinc finger protein

<400> 42

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg
85 90 95

Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 43
<211> 111
<212> PRT

<213> Artificial 

<220> 

<223> artificial zinc finger protein 



<400> 43 

Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Gly
20 25 30

Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu Thr Arg His Gln
35 40 45

Lys Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110

<210> 44 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 



<223> artificial zinc finger protein 
<400> 44 

Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe Ala Arg Ser Asp
1 5 10 15

Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu Lys Pro Tyr Lys
20 25 30

Cys Pro Asp Cys Gly Lys Ser Phe Ser Gln Ser Ser Ser Leu Ile Arg
35 40 45

His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys
50 55 60

Gly Lys Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys Ile Ile
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe
85 90 95

Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr Gly Glu
100 105 110

Lys 



<210> 45 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 



<223> artificial zinc finger protein 
<400> 45 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Glu Cys Asp His Cys Gly Lys
50 55 60

Ala Phe Ser Val Ser Ser Asn Leu Asn Val His Arg Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln
85 90 95

Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr Gly Glu Lys
100 105 110



<210> 46 
<211> 111 
<212> PRT
<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 46 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 47 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 47

Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe Ala Arg Ser Asp
1 5 10 15

Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu Lys Pro Tyr Lys
20 25 30

Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg
35 40 45

His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Ser Asp Cys
50 55 60

Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu Asn Arg His Arg Arg Thr
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe
85 90 95

Arg Gln Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr Gly Glu
100 105 110

Lys




<210> 48
<211> 111
<212> PRT
<213> Artificial

<220>

<223> artificial

<400> 48

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Asn
20 25 30

Tyr Cys Gly Lys Thr Phe Ser Val Ser Ser Thr Leu Ile Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Glu Lys Cys Gly Lys
50 55 60

Ala Phe Asn Gln Ser Ser Asn Leu Thr Arg His Lys Lys Ser His Thr
65 70 75 80

Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg
85 90 95

Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 49 
<211> 113 
<212> PRT 
<213> Artificial

<220>

<223> artificial zinc finger protein

<400> 49

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Ile Cys Arg
20 25 30

Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe
85 90 95

Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu
100 105 110



Lys 



<210> 50 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 50 

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu
1 5 10 15

Thr Thr His Lys Ile Ile His Thr Gly Glu Lys Pro Tyr Arg Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe
85 90 95

Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu
100 105 110

Lys 



<210> 51 

<211> 111 

<212> PRT 

<213> Artificial 



<220> 



<223> artificial zinc finger protein 
<400> 51 

Tyr Thr Cys Lys Gln Cys Gly Lys Ala Phe Ser Val Ser Ser Ser Leu
1 5 10 15

Arg Arg His Glu Thr Thr His Thr Gly Glu Lys Pro Tyr Arg Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg
50 55 60

Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Thr Cys Lys Gln Cys Gly Lys Ala Phe Ser Val
85 90 95

Ser Ser Ser Leu Arg Arg His Glu Thr Thr His Thr Gly Glu Lys
100 105 110



<210> 52 
<211> 111 
<212> PRT

<213> Artificial 

<220> 

<223> artificial zinc finger protein 



<400> 52 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Lys
20 25 30

Gln Cys Gly Lys Ala Phe Ser Val Ser Ser Ser Leu Arg Arg His Glu
35 40 45

Thr Thr His Thr Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg
85 90 95

Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 53 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 53 

Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe Ala Arg Ser Asp
1 5 10 15

Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu Lys Pro Tyr Lys
20 25 30

Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu Thr Arg
35 40 45

His Gln Lys Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Lys Gln Cys
50 55 60

Gly Lys Ala Phe Ser Val Ser Ser Ser Leu Arg Arg His Glu Thr Thr
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys Ala Phe
85 90 95

Arg Trp Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr Gly Glu
100 105 110

Lys 



<210> 54 

<211> 111 

<212> PRT 

<213> Artificial 

<220>



<223> artificial zinc finger protein 
<400> 54 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Gly
20 25 30

Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu Thr Arg His Gln
35 40 45

Lys Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Ser Lys Cys Gly Lys Ala Phe Thr Gln
85 90 95

Ser Ser Asn Leu Thr Val His Gln Lys Ile His Thr Gly Glu Lys
100 105 110



<210> 55 
<211> 111 
<212> PRT
<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 55 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His Thr Gly Glu Lys Pro Tyr Ile Cys Arg
20 25 30

Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Gly Gln Cys Gly Lys
50 55 60

Phe Tyr Ser Gln Val Ser His Leu Thr Arg His Gln Lys Ile His Thr
65 70 75 80

Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg
85 90 95

Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 56
<211> 111
<212> PRT
<213> Artificial

<220>

<223> artificial

<400> 56

Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu
1 5 10 15

Lys Thr His Thr Arg Thr His Thr Gly Glu Lys Pro Tyr Ile Cys Arg
20 25 30

Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys Ala Phe Arg Trp
85 90 95

Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 57 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 

<400> 57 

Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu
1 5 10 15

Lys Thr His Thr Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Lys Gln Cys Gly Lys
50 55 60

Ala Phe Gly Cys Pro Ser Asn Leu Arg Arg His Gly Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg
85 90 95

Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 58 

<211> 111 

<212> PRT 

<213> Artificial 
<220>

<223> artificial zinc finger protein

<400> 58

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Lys
20 25 30

Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu Arg Arg His Gly
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Gly Cys
85 90 95

Pro Ser Asn Leu Arg Arg His Gly Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 59

<211> 111 
<212> PRT 
<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 59 

Tyr Lys Cys Pro Asp Cys Gly Lys Ser Phe Ser Gln Ser Ser Ser Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Gly
20 25 30

Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu Thr Arg His Gln
35 40 45

Lys Ile His Thr Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg
50 55 60

Gly Phe Ser Arg Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg
85 90 95

Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 60 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 60 

Tyr Glu Cys Asn Tyr Cys Gly Lys Thr Phe Ser Val Ser Ser Thr Leu
1 5 10 15

Ile Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys
35 40 45

Ile Ile His Thr Gly Glu Lys Pro Tyr Arg Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Trp Pro Ser Asn Leu Thr Arg His Lys Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 61 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 61

Tyr Glu Cys Asn Tyr Cys Gly Lys Thr Phe Ser Val Ser Ser Thr Leu
1 5 10 15

Ile Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Glu
20 25 30

Lys Cys Gly Lys Ala Phe Asn Gln Ser Ser Asn Leu Thr Arg His Lys
35 40 45

Lys Ser His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys
50 55 60

Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Glu Cys Glu Lys Cys Gly Lys Ala Phe Asn Gln
85 90 95

Ser Ser Asn Leu Thr Arg His Lys Lys Ser His Thr Gly Glu Lys
100 105 110



<210> 62 

<211> 111 

<212> PRT 
<213> Artificial



<220> 

<223> artificial zinc finger protein 
<400> 62 

Tyr Glu Cys Glu Lys Cys Gly Lys Ala Phe Asn Gln Ser Ser Asn Leu
1 5 10 15

Thr Arg His Lys Lys Ser His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Glu Lys Cys Gly Lys
50 55 60

Ala Phe Asn Gln Ser Ser Asn Leu Thr Arg His Lys Lys Ser His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Glu Cys Asp His Cys Gly Lys Ala Phe Ser Val
85 90 95

Ser Ser Asn Leu Asn Val His Arg Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 63 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 63 

Tyr Thr Cys Ser Asp Cys Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu
1 5 10 15

Asn Arg His Arg Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys
20 25 30

Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Glu Cys Asn Tyr Cys Gly Lys
50 55 60

Thr Phe Ser Val Ser Ser Thr Leu Ile Arg His Gln Arg Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys Thr Trp Lys Phe
85 90 95

Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg His Thr Gly Glu
100 105 110

Lys 



<210> 64 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 64

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Ser
20 25 30

Asp Cys Gly Lys Ala Phe Arg Asp Lys Ser Cys Leu Asn Arg His Arg
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Phe Gln Cys Lys Thr Cys Gln Arg
50 55 60

Lys Phe Ser Arg Ser Asp His Leu Lys Thr His Thr Arg Thr His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 65 
<211> 111 
<212> PRT 
<213> Artificial



<220> 

<223> artificial zinc finger protein 
<400> 65 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Val Cys Ser Lys Cys Gly Lys
50 55 60

Ala Phe Thr Gln Ser Ser Asn Leu Thr Val His Gln Lys Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Val Cys Ser Lys Cys Gly Lys Ala Phe Thr Gln
85 90 95

Ser Ser Asn Leu Thr Val His Gln Lys Ile His Thr Gly Glu Lys
100 105 110



<210> 66 

<211> 113 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 66 

Phe Gln Cys Lys Thr Cys Gln Arg Lys Phe Ser Arg Ser Asp His Leu
1 5 10 15

Lys Thr His Thr Arg Thr His Thr Gly Glu Lys Pro Tyr Thr Cys Lys
20 25 30

Gln Cys Gly Lys Ala Phe Ser Val Ser Ser Ser Leu Arg Arg His Glu
35 40 45

Thr Thr His Thr Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys
50 55 60

Thr Trp Lys Phe Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Lys Cys Pro Asp Cys Gly Lys Ser Phe
85 90 95

Ser Gln Ser Ser Ser Leu Ile Arg His Gln Arg Thr His Thr Gly Glu
100 105 110

Lys 



<210> 67 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 67

Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Pro
20 25 30

Asp Cys Gly Lys Ser Phe Ser Gln Ser Ser Ser Leu Ile Arg His Gln
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Glu Cys Glu Lys Cys Gly Lys
50 55 60

Ala Phe Asn Gln Ser Ser Asn Leu Thr Arg His Lys Lys Ser His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 68 
<211> 111 
<212> PRT 
<213> Artificial



<220> 

<223> artificial zinc finger protein 
<400> 68 

Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Ser Cys Gly
20 25 30

Ile Cys Gly Lys Ser Phe Ser Asp Ser Ser Ala Lys Arg Arg His Cys
35 40 45

Ile Leu His Thr Gly Glu Lys Pro Tyr Glu Cys Glu Lys Cys Gly Lys
50 55 60

Ala Phe Asn Gln Ser Ser Asn Leu Thr Arg His Lys Lys Ser His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Arg Gln
85 90 95

Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr Gly Glu Lys
100 105 110



<210> 69 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 
<400> 69 

Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu
1 5 10 15

Thr Arg His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Lys
20 25 30

Gln Cys Gly Lys Ala Phe Gly Cys Pro Ser Asn Leu Arg Arg His Gly
35 40 45

Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys
50 55 60

Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys Ile Ile His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg
85 90 95

Lys Ser Asn Leu Ile Arg His Gln Arg Thr His Thr Gly Glu Lys
100 105 110



<210> 70 

<211> 111 

<212> PRT 

<213> Artificial 

<220> 

<223> artificial zinc finger protein 

<400> 70 

Tyr Ile Cys Arg Lys Cys Gly Arg Gly Phe Ser Arg Lys Ser Asn Leu
1 5 10 15

Ile Arg His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Lys Cys Glu
20 25 30

Glu Cys Gly Lys Ala Phe Arg Gln Ser Ser His Leu Thr Thr His Lys
35 40 45

Ile Ile His Thr Gly Glu Lys Pro Tyr Ser Cys Gly Ile Cys Gly Lys
50 55 60

Ser Phe Ser Asp Ser Ser Ala Lys Arg Arg His Cys Ile Leu His Thr
65 70 75 80

Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe Asn Arg
85 90 95

Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu Lys
100 105 110



<210> 71 
<211> 113 
<212> PRT 
<213> Artificial 

<220>

<223> artificial zinc finger protein 

<400> 71 

Tyr Lys Cys Gly Gln Cys Gly Lys Phe Tyr Ser Gln Val Ser His Leu
1 5 10 15

Thr Arg His Gln Lys Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Met
20 25 30

Glu Cys Gly Lys Ala Phe Asn Arg Arg Ser His Leu Thr Arg His Gln
35 40 45

Arg Ile His Thr Gly Glu Lys Pro Tyr Val Cys Asp Val Glu Gly Cys
50 55 60

Thr Trp Lys Phe Ala Arg Ser Asp Glu Leu Asn Arg His Lys Lys Arg
65 70 75 80

His Thr Gly Glu Lys Pro Tyr Lys Cys Met Glu Cys Gly Lys Ala Phe
85 90 95

Asn Arg Arg Ser His Leu Thr Arg His Gln Arg Ile His Thr Gly Glu
100 105 110

Lys



<210> 72 

<211> 96 

<212> PRT 

<213> Homo sapiens 

<400> 72 

Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe Lys
1 5 10 15

Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr
20 25 30

Ala Gln Gln Ile Val Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn
35 40 45

Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu Arg
50 55 60

Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile His Gln
65 70 75 80

Glu Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys Ser Ser Val
85 90 95



<210> 73 

<211> 260 

<212> PRT 

<213> Homo sapiens 

<400> 73 

Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg Lys
1 5 10 15

Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe Ser
20 25 30

Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro Ser
35 40 45

Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro Phe
50 55 60

Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met Val
65 70 75 80

Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala Pro
85 90 95

Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala Met
100 105 110

Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala Pro
115 120 125

Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln Ala
130 135 140

Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp Asp
145 150 155 160

Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val Phe
165 170 175

Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn
180 185 190

Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro Met Leu Met Glu
195 200 205

Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Ala Gln Arg Pro Pro Asp
210 215 220

Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly Leu Leu
225 230 235 240

Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe Ser Ala
245 250 255

Leu Leu Ser Gln
260



<210> 74 

<211> 127 

<212> PRT 

<213> Saccharomyces cerevisiae

<400> 74 

Asn Phe Asn Gln Ser Gly Asn Ile Ala Asp Ser Ser Leu Ser Phe Thr
1 5 10 15

Phe Thr Asn Ser Ser Asn Gly Pro Asn Leu Ile Thr Thr Gln Thr Asn
20 25 30

Ser Gln Ala Leu Ser Gln Pro Ile Ala Ser Ser Asn Val His Asp Asn
35 40 45

Phe Met Asn Asn Glu Ile Thr Ala Ser Lys Ile Asp Asp Gly Asn Asn
50 55 60

Ser Lys Pro Leu Ser Pro Gly Trp Thr Asp Gln Thr Ala Tyr Asn Ala
65 70 75 80

Phe Gly Ile Thr Thr Gly Met Phe Asn Thr Thr Thr Met Asp Asp Val
85 90 95

Tyr Asn Tyr Leu Phe Asp Asp Glu Asp Thr Pro Pro Asn Pro Lys Lys
100 105 110

Glu Ile Ser Met Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser
115 120 125



<210> 75 
<211> 63 

<212> PRT

<213> Homo sapiens 

<400> 75 

Val Ser Val Thr Phe Glu Asp Val Ala Val Leu Phe Thr Arg Asp Glu
1 5 10 15

Trp Lys Lys Leu Asp Leu Ser Gln Arg Ser Leu Tyr Arg Glu Val Met
20 25 30

Leu Glu Asn Tyr Ser Asn Leu Ala Ser Met Ala Gly Phe Leu Phe Thr
35 40 45

Lys Pro Lys Val Ile Ser Leu Leu Gln Gln Gly Glu Asp Pro Trp
50 55 60



<210> 76 

<211> 12 

<212> DNA 

<213> Homo sapiens 
<400> 76

gtttgggagg tc 12



<210> 77 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 77 

tgggaggtca ga 12 



<210> 78

<211> 12

<212> DNA

<213> Homo sapiens

<400> 78

gtcagaaata gg 12



<210> 79

<211> 12

<212> DNA

<213> Homo sapiens

<400> 79

gccagagccg gg 12



<210> 80

<211> 12

<212> DNA

<213> Homo sapiens

<400> 80

gagcggggag aa 12



<210> 81

<211> 12

<212> DNA

<213> Homo sapiens

<400> 81

ggggagaggg ac 12


<210> 82 
<211> 12

<212> DNA 

<213> Homo sapiens 

<400> 82 

gtggggagag gg 12 



<210> 83 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 83 

ggggcagggg aa 12 



<210> 84 

<211> 12 

<212> DNA 

<213> Homo sapiens 



<400> 84 

gacagggcct ga 12 



<210> 85 
<211> 12 



<212> DNA

<213> Homo sapiens 

<400> 85 

ggtgggggtc ga 12 



<210> 86 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 86 

caagtgggga at 12 



<210> 87 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 87

gggtgggggg ag 12



<210> 88 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 88 

agggggtggg gg 12 

<210> 89

<211> 12

<212> DNA

<213> Homo sapiens

<400> 89

gggtggggag ag 12



<210> 90

<211> 12

<212> DNA

<213> Homo sapiens

<400> 90

[sequence illegible in source] 12



<210> 91

<211> 12

<212> DNA

<213> Homo sapiens

<400> 91

agaaataggg gg 12



<210> 92

<211> 12

<212> DNA

<213> Homo sapiens

<400> 92

gggggtgggg gg 12


<210> 93 
<211> 12 
<212> DNA

<213> Homo sapiens 

<400> 93 

agagccgggg tg 12

<210> 94 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 94 

agggaagctg gg 12 



<210> 95 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 95 

gtgggtgagt ga 12 



<210> 96 
<211> 12 
<212> DNA 



<213> Homo sapiens 
<400> 96 

gtgtggggtt ga 12



<210> 97 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 97 

gttgagggtg tt 12 

<210> 98 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 98 

gagggtgttg ga 12 

<210> 99

<211> 12

<212> DNA

<213> Homo sapiens

<400> 99

ggtgttggag cg 12



<210> 100 

<211> 12 

<212> DNA

<213> Homo sapiens 

<400> 100 

ggggagaggg ac 12 



<210> 101 
<211> 12 
<212> DNA

<213> Homo sapiens 

<400> 101 

tggggagagg ga 12 



<210> 102

<211> 12

<212> DNA

<213> Homo sapiens 



<400> 102 

ggtggggaga gg 12 



<210> 103 

<211> 12 

<212> DNA

<213> Homo sapiens

<400> 103

agggacgggt gg 12



<210> 104 

<211> 12 

<212> DNA

<213> Homo sapiens

<400> 104

gacagggacg gg 12 

<210> 105 

<211> 12 

<212> DNA

<213> Homo sapiens 

<400> 105 

gaggagggag ca 12 

<210> 106 

<211> 12 

<212> DNA

<213> Homo sapiens

<400> 106

gggggtcgag ct 12 



<210> 107 
<211> 12 
<212> DNA 
<213> Homo sapiens 

<400> 107

gaaggggaag ct 12 

<210> 108 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 108 

aatgaagggg aa 12 

<210> 109 

<211> 12 

<212> DNA

<213> Homo sapiens 

<400> 109 

gcggctcggg cc 12 
<210> 110 
<211> 12

<212> DNA 

<213> Homo sapiens 

<400> 110 

gggcgggccg gg 12 



<210> 111 

<211> 12 

<212> DNA

<213> Homo sapiens 

<400> 111 

aaaaaagggg gg 12 



<210> 112 

<211> 12 

<212> DNA 

<213> Homo sapiens 



<400> 112 

gcagcggtta gg 12 



<210> 113 
<211> 12



<212> DNA 

<213> Homo sapiens 

<400> 113

ggggaagtag ag 12 



<210> 114 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 114 

agagaagtcg ag 12



<210> 115 

<211> 12 

<212> DNA 

<213> Homo sapiens 

<400> 115

gagagagacg gg 12



<210> 116 

<211> 12 

<212> DNA

<213> Homo sapiens

<400> 116

ggggtcagag ag 12



<210> 117 

<211> 12 

<212> DNA

<213> Homo sapiens

<400> 117

ggggtggggg ga 12



<210> 118 



<211> 12 

<212> DNA

<213> Homo sapiens 



<400> 118 
caagggggag_ 



12



<210> 119 

<211> 90 

<212> PRT 

<213> Saccharomyces cerevisiae 

<400> 119 



Asn Ser Ala Ser Ser Ser Thr Lys Leu Asp Asp Asp Leu Gly Thr Ala 
1 5 10 15 

Ala Ala Val Leu Ser Asn Met Arg Ser Ser Pro Tyr Arg Thr His Asp 
20 25 30 

Lys Pro I le Ser Asn Val Asn Asp Met Asn Asn Thr Asn Ala Leu Gly 
35 40 45 

Val Pro Ala Ser Arg Pro His Ser Ser Ser Phe Pro Ser Lys Gly Val 
50 55 60 
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Leu Arg Pro lie Leu Leu Arg lie His Asn Ser Glu Gin Gin Pro lie 
65 70 75 80 

Phe Glu Ser Asn Asn Ser Thr Ala Cys Ile
85 90 



<210> 120 

<211> 3480 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_RNA 

<222> (2363).. (2363) 

<223> mRNA start site 

<220> 

<221> misc_signal 
<222> (3401).. (3403) 

<223> translation start site 

<400> 120 

gaattctgtg ccctcactcc cctggatccc tgggcaaagc cccagaggga aacacaaaca 60 
ggttgttgta acacaccttg ctgggtacca ccatggagga cagttggctt atgggggtgg 120 



ggggtgcctg gggccacgga gtgactggtg atggctatcc ctccttggaa cccctccagc 180 

ctcctcttag cttcagattt gtttatttgt tttttactaa gacctgctct ttcaggtctg 240 

ttggctcttt taggggctga agaaggccga gttgagaagg gatgcaaggg agggggccag 300 

aatgagccct tagggctcag agcctccatc ctgccccaag atgtctacag cttgtgctcc 360 

tggggtgcta gaggcgcaca aggaggaaag ttagtggctt cccttccata tcccgttcat 420 

cagcctagag catggagccc aggtgaggag gcctgcctgg gagggggccc tgagccagga 480 

aataaacatt tactaactgt acaaagacct tgtccctgct gctggggagc ctgccaagtg 540 

gtggagacag gactagtgca cgaatgatgg aaagggaggg ttggggtggg tgggagccag 600 

cccttttcct cataagggcc ttaggacacc ataccgatgg aactgggggt actggggagg 660 

taacctagca cctccaccaa accacagcaa catgtgctga ggatggggct gactaggtaa 720 

gctccctgga gcgttttggt taaattgagg gaaattgctg cattcccatt ctcagtccat 780 
gcctccacag aggctatgcc agctgtaggc cagaccctgg caagatctgg gtggataatc 840

agactgactg gcctcagagc cccaactttg ttccctgggg cagcctggaa atagccaggt 900

cagaaaccag ccaggaattt ttccaagctg cttcctatat gcaagaatgg gatgggggcc 960

tttgggagca cttagggaag atgtggagag ttggaggaaa agggggcttg gaggtaaggg 1020

aggggactgg gggaaggata ggggagaagc tgtgagcctg gagaagtagc caagggatcc 1080

tgagggaatg ggggagctga gacgaaaccc ccatttctat tcagaagatg agctatgagt 1140

ctgggcttgg gctgatagaa gccttggccc ctggcctggt gggagctctg ggcagctggc 1200

ctacagacgt tccttagtgc tggcgggtag gtttgaatca tcacgcaggc cctggcctcc 1260

acccgccccc accagccccc tggcctcagt tccctggcaa catctggggt tgggggggca 1320

gcaggaacaa gggcctctgt ctgcccagct gcctccccct ttgggttttg ccagactcca 1380

cagtgcatac gtgggctcca acaggtcctc ttccctccca gtcactgact aaccccggaa 1440



ccacacagct tcccgttctc agctccacaa acttggtgcc aaattcttct cccctgggaa 1500 
gcatccctgg acacttccca aaggacccca gtcactccag cctgttggct gccgctcact 1560 
ttgatgtctg caggccagat gagggctcca gatggcacat tgtcagaggg acacactgtg 1620



gcccctgtgc ccagccctgg gctctctgta catgaagcaa ctccagtccc aaatatgtag 1680

ctgtttggga ggtcagaaat agggggtcca ggagcaaact ccccccaccc cctttccaaa 1740

gcccattccc tctttagcca gagccggggt gtgcagacgg cagtcactag ggggcgctcg 1800

gccaccacag ggaagctggg tgaatggagc gagcagcgtc ttcgagagtg aggacgtgtg 1860

tgtctgtgtg ggtgagtgag tgtgtgcgtg tggggttgag ggtgttggag cggggagaag 1920

gccaggggtc actccaggat tccaacagat ctgtgtgtcc ctctccccac ccgtccctgt 1980

ccggctctcc gccttcccct gcccccttca atattcctag caaagaggga acggctctca 2040

ggccctgtcc gcacgtaacc tcactttcct gctccctcct cgccaatgcc ccgcgggcgc 2100

gtgtctctgg acagagtttc cgggggcgga tgggtaattt tcaggctgtg aaccttggtg 2160

ggggtcgagc ttccccttca ttgcggcggg ctgcgggcca ggcttcactg ggcgtccgca 2220

gagcccgggc ccgagccgcg tgtggagggg ctgaggctcg cctgtccccg ccccccgggg 2280

cgggccgggg gcggggtccc ggcggggcgg agccatgcgc cccccccttt tttttttaaa 2340

agtcggctgg tagcggggag gatcgcggag gcttggggca gccgggtagc tcggaggtcg 2400 

tggcgctggg ggctagcacc agcgctctgt cgggaggcgc agcggttagg tggaccggtc 2460 

agcggactca ccggccaggg cgctcggtgc tggaatttga tattcattga tccgggtttt 2520 

atccctcttc ttttttctta aacatttttt tttaaaactg tattgtttct cgttttaatt 2580

tatttttgct tgccattccc cacttgaatc gggccgacgg cttggggaga ttgctctact 2640 

tccccaaatc actgtggatt ttggaaacca gcagaaagag gaaagaggta gcaagagctc 2700 

cagagagaag tcgaggaaga gagagacggg gtcagagaga gcgcgcgggc gtgcgagcag 2760 

cgaaagcgac aggggcaaag tgagtgacct gcttttgggg gtgaccgccg gagcgcggcg 2820 

tgagccctcc cccttgggat cccgcagctg accagtcgcg ctgacggaca gacagacaga 2880 

caccgccccc agccccagct accacctcct ccccggccgg cggcggacag tggacgcggc 2940 



ggcgagccgc gggcaggggc cggagcccgc gcccggaggc ggggtggagg gggtcggggc 3000 
tcgcggcgtc gcactgaaac ttttcgtcca acttctgggc tgttctcgct tcggaggagc 3060 
cgtggtccgc gcgggggaag ccgagccgag cggagccgcg agaagtgcta gctcgggccg 3120 



ggaggagccg cagccggagg agggggagga ggaagaagag aaggaagagg agagggggcc 3180 

gcagtggcga ctcggcgctc ggaagccggg ctcatggacg ggtgaggcgg cggtgtgcgc 3240 

agacagtgct ccagccgcgc gcgctcccca ggccctggcc cgggcctcgg gccggggagg 3300 

aagagtagct cgccgaggcg ccgaggagag cgggccgccc cacagcccga gccggagagg 3360 

gagcgcgagc cgcgccggcc ccggtcgggc ctccgaaacc atgaactttc tgctgtcttg 3420 

ggtgcattgg agccttgcct tgctgctcta cctccaccat gccaaggtaa gcggtcgtgc 3480 
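
The <222> locations in this listing are 1-based and inclusive, and each row of residues ends with the number of its last residue. As a minimal sketch of reading an annotated feature from one row (the helper name `feature_slice` is hypothetical and not part of the listing), the translation start site annotated at (3401)..(3403) of SEQ ID NO: 120 can be pulled from the row that ends at residue 3420:

```python
# Row of SEQ ID NO: 120 that ends at residue 3420, copied from the listing.
ROW = "gagcgcgagc cgcgccggcc ccggtcgggc ctccgaaacc atgaactttc tgctgtcttg"
ROW_END = 3420  # residue number printed at the end of that row


def feature_slice(row: str, row_end: int, start: int, stop: int) -> str:
    """Return residues start..stop (1-based, inclusive) from one listing row.

    Hypothetical helper for illustration only.
    """
    residues = row.replace(" ", "")
    row_start = row_end - len(residues) + 1  # number of the first residue in the row
    if start < row_start or stop > row_end or start > stop:
        raise ValueError("feature lies outside this row")
    # Convert 1-based inclusive coordinates to a 0-based half-open slice.
    return residues[start - row_start : stop - row_start + 1]


print(feature_slice(ROW, ROW_END, 3401, 3403))
```

The same offset arithmetic applies to any other <222> range, provided the range falls within a single row.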



<210> 121 

<211> 8024

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_feature
<222> (3731).. (3731)
<223> mRNA start site
<220> 

<221> misc_feature 

<222> (3959).. (3961) 

<223> ATG 

<400> 121 

ccgggctgag ctcagtcatt ttgccctgag gactataagt ggactattat gcagcacttt 60 

cttttttatt attattacta ttaagccaag taagttctta acagctaaca cctgagctgg 120 

tggctctgag aagcctcttc actccttcac gggagacggg accattcaca tgaagatcct 180 

acattgttgt tttttttttt ttggaggtcg aaaaaggtca ctgttaggag gctttctggg 240 

cctttgctcc tctccctcaa tttattaccc ctccagtggc tgatgacgta cagggagact 300 

tccacccgat aatgacatgg ctttgtttat ttcacaaatt cccagcattt actgttaatc 360 

agacccagtt tgaaccaccc ccaaggggct tgcagtctaa acagctcact ttgctcagcc 420 

tcttcctgag gtcaggcact gtcttgctaa ggccgacatc agctcatgcc cattttacag 480 

atggggaaac tgagaatgct aagaagtgaa atagcgtaag gttatacaac taacagggag 540 

acagcctaaa cttgaaccca accggaagcc caacatggcc ccaagccttc ctcgaacccc 600 



aggacttggc aaagcgggcg tcctggggta aagcatggca gaagggcttt gggtccaagc 660 

taagtgaggg tcctgtttct agatcacctg gccaggtgca gtggctcatg cctgtaatcc 720 

cagcactttg ggaggctgag gcgggaggat tgcttgagct caaaagtttg agtccagccc 780 

gggcaataca gcgagacctc gtctctacta aaaaagaaaa caaaaaatta gctgagtgtg 840 

tagtcccagc tactcaggag actgaggctg gaggattgct taagcctgga agtttgaggc 900 

tgtagagcta tgatagagcc actgcacttt agcctgggca atggagcaag atactatctc 960 

aaaaaaaaaa aaatatatat ataggtcccc ttgtccctct gctgagaagt aaccagatct 1020 

ggaaaagatt tagtcacctt ggtccaacta tttctttcac ataaagaaaa aaaaaggcaa 1080 

tgcagacctt cccatggggg cagctctgcc tgaggccttt gcaggtacct ctgtttgtct 1140 

gccccggggc acagtggcag attgggcagg gcagcttgca gtgaggattg ctgatggatg 1200 

agctcctagt gtacctagcc agccatttac tcacaaacag ctattgagca cctactatgt 1260 
gcccagcact ggaggtacaa ctggcaacaa cacaaatccg ggcttgctcc atggaggtga 1320

caatctaaat gcggtggagg gtcagctaac aagtgcagaa ggttctctta agagctcaaa 1380 

gaagctccaa ccagaaggac tgggcagggg atccagaagg catccccgag tggctactcc 1440 

aatggagtgg cttctccatt caggcaaacc tgaatgggat aagtcattgg caggaagatc 1500 

tggggccggg ggtcatccag tgggaagggg agagatgacg cggtcagcat ggcgggaaca 1560 

caggagcaga aaggaagcag gtgggaagcc aggtcaaggg ccaggggcac ggaaaggggt 1620 

cagatgcaga taagtgagtg cttcctggtg catccttcat ccgcaattca tccttacctg 1680 

tgcttttgtt gcctccattg cacagctgag gaggccaggg cctgcggagg ttgagagtgt 1740 

gctcagggag cccccggagc aaagtggaag ccagattcca gatcagttct gctgggaatt 1800 

cccagctccc aaaagccctg ctggctgtca gtccccagtc accacaagca cctatcctgt 1860 

Qtgggtgggc ctgcagttct gggagatata tcagctgcct gcagcgtcct ttgctgaact 1920 



cacagcaaat aggagagaca gggaggggtc cttgggaagc cctaaattga gcttgctgtg 1980 
ggagtcctgg gaagaaagga gcctcatcct atcaaaagcc ggggggaaga catcagagtc 2040 
cctctgctca ggtcagctgg cacaggtggg tctccaggcc tgggtctcac ttccccagag 2100



ggtgtgttcg ggtggcccca ggctgaggga ggaaagccca cctcccatgt cattttgcaa 2160

atggggagtc agggacctag agatggaaag acaacacagc aagtgaggga tgggttctag 2220

gtcccctgca ccctgcaccc tgcaccctgg ccaacgatgt ctatttggca ccagatctgc 2280

aggctcatct gggggacccc aggacccaga ggcagccggg ttgcatctcg aagctgtgag 2340

ctgcagccca ggaaggtcca ggtctgggtg gcgctgccca agcaggctgc aggcccaagg 2400

aggaacaaag atcctctcaa ggggtgcgga gctgaggttc cggtcctgcc aaagccactt 2460

gatgaccccc aagtgccccc ctttctgcac ctcagagaag agccctcaag cctcccaggt 2520

cccctccagg ggcacgaata agccccagca gggttctgaa ggggtcccag gaatctccct 2580

gtggggatgc ggtggaggtg gaggaggctg cggtggcctg gggacatctc tggtcacagg 2640

tgctggtggt atgagagatg gggtaggcac caagccccct gcagctgtgg ctaggcgggc 2700

ctgcaggaag ggccaggcag gctcctcagg gaccacaaag aacaggggtt ttcacaccta 2760

ggtgggcctg catctagcta ggccagtccc catcaggcca taatgggcac agtgggaggt 2820

agaaccatga gtgagagagg ggaggcttcc agaggcctgg cctgggtccc tgctagattg 2880 

agggctctgg ctatggtaca tggatatttc tgctgtggaa tcaaaggagc aggggatgct 2940 

gaatatcccc tctggcccta tgccctgcta cctgtccttt cacggaaggg tgtgtgtgta 3000 

oggggtgcag gaccaggcct ccctgggtgc atctctgcca ccttgccctt tggctcaggt 3060 

ggacctccac caggtattca gaactccagc ccagaaacgc gccaagcctg tggggccaag 3120 

acctaggggg tgggggtggc ctccctcccg cctgtagcca aagggtcctc ccttgcccag 3180 

ccaggccccg gtgtcgctta ctgctcttat ccacccctcc ttcccaggcc ggtcctcaag 3240 

gccccagcaa aggaaccaag ttcccgtgag cctccgaaag gcgaagggca ggcagcagcc 3300 

gctggcttct gcgcccacta ggagcttcgg atgcccgagt tagggctgcg ccaaggcggc 3360 

cggagcagag agggagacgg ggacggggac aggcagggac aaagtgcaag aggcaaaact 3420 



ggctgaaaag cagaagtgta ggagccgcca aggggcggga cgaacaggtc cgtgggccgg 3480 
gcggagccaa gggtgggggc cggggtccct ccaggtggca ctcgcggcgc tagtccccag 3540 
cctcctccct tcccccggcc ctgattggca ggcggcctgc gaccagccgc gaacgccaca 3600



gcgccccggg cgcccaggag aacgcgaacg gccccccgcg ggagcgggcg agtaggaggg 3660 

ggcgccgggc tatatatata gcggctcggc ctcgggcggg cctggcgctc agggaggcgc 3720 

gcactgctcc tcagagtccc agctccagcc gcgcgctttc cgcccggctc gccgctccat 3780 

gcagccgggg tagagcccgg cgcccggggg ccccgtcgct tgcctcccgc acctcctcgg 3840 

ttgcgcactc ccgcccgagg tcggccgtgc gctcccgcgg gccgccacag gcgcagctct 3900 

gccccccagc ttcccgggcg cactgaccgc ctgaccgacg cacggccctc gggccgggat 3960 

gtcggggccc gggacggccg cggtagcgct gctcccggcg gtcctgctgg ccttgctggc 4020 

gccctgggcg ggccgagggg gcgccgccgc acccactgca cccaacggca cgctggaggc 4080

cgagctggag cgccgctggg agagcctggt ggcgctctcg ttggcgcgcc tgccggtggc 4140 

agcgcagccc aaggaggcgg ccgtccagag cggcgccggc gactacctgc tgggcatcaa 4200 

gcggctgcgg cggctctact gcaacgtggg catcggcttc cacctccagg cgctccccga 4260 




cggccgcatc ggcggcgcgc acgcggacac ccgcgacagt gagtggcgcg gccaggcgcg 4320 

aaggggcggg ggcggggggc aacggccgcc gggccaaccc gctcagtcac actctgagac 4380 

cctcggcggg cacctgctcg ggggccccgg gaaccggggc ggactcgggc tccggtccct 4440 

tctgacgcgg ggctggggac gcagacactc ttggctccgg cagcccagcg caacccctga 4500 

ggtcgggcgc cgcctcccgc cttcagaaac tcgggctccg agcgccgaat tccagcgcct 4560 

tcgcccgtgg gcacagggcg cgcggtgcag ccacaggggg cccgagacac gcgccccggc 4620 

ctggcccagg ctggggaacc gctggggtcg ggctcgcgtc tgaaggtccg ggactgggtg 4680 

cggccgccgg gggtccccta cacaggcaag ctaatctgag ctagcgcagg cttgggctcc 4740 

ggaggcccta gagggcagct tgggctctgg aggcccttgg gggcggctgc gccgggaacc 4800 

ctggcccttt atccccaacc ccaccccaga aatagggtcc ccggaggcga acaagccgag 4860 

gggcggagtg ggccagggat cacctgcccc gcaatgacct gcgccccgcc cccaggcctg 4920 



ctggagctct cgcccgtgga gcggggcgtg gtgagcatct tcggcgtggc cagccggttc 4980 
ttcgtggcca tgagcagcaa gggcaagctc tatggctcgg tgagtaccgc aggggtctgg 5040 
ctaggcacct agttgggaac agcggacatg gctagcaggc tcgtggcttc tccagcccca 5100 



cctgtgcctg ggtcttggag gggtggcagg gtcaccaggt cacgggaccg gcaggcctcc 5160 

ccagacaaag gaagcagccc caaggcagga acaatgaggt tcctgccatc cctgagtggg 5220 

cccctcccag accgaggaaa gggcgctatt gagagccctt cccttctcta gtccagaggg 5280 

gtaggtctca gtgttggaac tgcgggcttg aggctggaca cgcagggaat gaattctctg 5340 

gctgctaggt gcagggcagg tggtgagagc accagctgtt gtgggctggc catgtcccct 5400 

tctcaccctg tgtgggtctt gacaccttaa ctgctcagca gagacatctc agcccagggt 5460 

ggggggtggg acagaagggg gttctgaccc ctggcttcag gctgggtacc ttgcccaaga 5520 

ggtgccccag ccctgacact gccctgcttt gctgcagccc ttcttcaccg atgagtgcac 5580 

gttcaaggag attctccttc ccaacaacta caacgcctac gagtcctaca agtaccccgg 5640 

catgttcatc gccctgagca agaatgggaa gaccaagaag gggaaccgag tgtcgcccac 5700 

catgaaggtc acccacttcc tccccaggct gtgaccctcc agaggaccct tgcctcagcc 5760 
tcgggaagcc cctgggaggg cagtgccgag ggtcaccttg gtgcactttc ttcggatgaa 5820

gagtttaatg caagagtagg tgtaagatat ttaaattaat tatttaaatg tgtatatatt 5880 

gccaccaaat tatttatagt tctgcgggtg tgttttttaa ttttctgggg ggaaaaaaag 5940 

acaaaacaaa aaaccaactc tgacttttct ggtgcaacag tggagaatct taccattgga 6000 

tttctttaac ttgtcaaaag ttgtcacgag tgtgctgcta ttctgtgttt taaaaaaagg 6060 

tgacattgga ttccgatgtc atcccctgta gtatggcgtg gagcatctct gtctggaaag 6120 

gcccgcctga ggcttgggca gccagttcag ggagctccca ggcttggctc tcggctagca 6180 

tcctcagagg cccactccct ttgtgccctg ttgctattaa tcgggacata tcggtttact 6240 

tcgggtacag aaagtgcggt gttgaagtcc tcgctgccac tctgttttta gatctgccaa 6300 

gactgacctt tgaactttcc tgtagtcaat cttcctcgat ctaccagatg ggagagaccc 6360

ttggacaact ttataaactc ctgtttgcct tttttggatc agcgacagcc cccatcgctg 6420 



tgactattgg ggaaaagacg aagctctttc ataaattcca tggagaggaa tcaatatccc 6480 
actggaaggc tagaaatgga caagatagtg tatttgcaat cacaaacaaa accctagtga 6540 
tgaaaaataa tttgtgatgg cagatgcttc tgatggtgtg atagaatatg tttttgaaaa 6600 



caaaccatcg aaccccccgc cccaccccca aaacgggctt ccctgtgttt agggagcttt 6660 

gggctagaac tagctacgat ttttaggtga aatgtccttg taattgtaca aagcacttgg 6720 

tgcagtgttt gcgtggagca gcctgctgct ttctgatgca ttccctgttt aagtgcgttt 6780 

aacatctacc tcacaagccc tgaaacccca ggcaaaaccc acagaaagct catacccggt 6840 

gcaggagttt gccatcccaa gtggcttttt ttccatatgt agccaaaaag gattgcagat 6900 

agcgtcggtg cgtcccattc gaaccttgtc acgtttgagc tatctttacc ctgtgattta 6960 

cttttagtaa gggtgatcat ggtgaaaata tttgcagaca gctgttacag tacactatat 7020 

ggtcaccaag taaccttata tttttcttta tatattttac aaatgtaacc cctgtcattg 7080 

aagcaaccgt ggaagaggca gggtcggtga tgtttaaaaa aagttccgag gtgatggcaa 7140 

acatttaatt ttaatgaatg actttttaga gtttatacaa aatgacctta gcttgctacc 7200 

agaaatgctc cgaatgtttc gtcaagactt taatactctc ctaggatgtt tctgaactgt 7260 
ctcccgaatt aactttatgg gagtctacag acagcaagac tggaaaatct gattggagtt 7320

tttgtctttc acattccttt tgaaaactct ttgttcgaat gcaaatcatc gacttaaaat 7380 

actattctta accaaggcct ggaagaaaga agacacttgc aaagccgcta agacaggacc 7440 

acacatctta aactgctgtt cctaccatgc actaaactgt ttttaagttt taaaccacac 7500 

cctaggctcc aggagtgttc aggaaagatg gtgtttgtag gtctccatgc tgtttggcgt 7560 

tggggggtgt ggagggatca tccgtcgact ttctgaattt taatgtattc acttagtaac 7620 

aaaccatgat tgtcttaaat gccttaaatt attatgagat ttcttgtctc agagcccaat 7680 

cagattgtca ggaattaaca tgtgttaggt ttgatcaccc ttgaccactt cttatagata 7740 

tttcttcaac aaatcatgtg tgatgcctgt aggaacacaa ctgtaccttt aaaatattgt 7800

tttcatattg ctgtgatggg gattcgaggt tcctgtatgt gccactgttt tcagaatctg 7860 

tagttttata caggtgccga ccctcgttgt gatgtatgtg ctgtgcacat tgacatgctg 7920 

accgacaatg ataagcgttt atcgtgtata aaaagacacc actggactgg atgtacacaa 7980 

ctgggaaagg aattaaaagc tattaaaatt gtgccttgaa atgc 8024 

<210> 122



<211> 7000 

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_feature 

<222> (4389) .. (4389) 

<223> mRNA start site 

<220> 

<221> misc_feature 

<222> (4454).. (4456) 

<223> ATG 



<400> 122 

aatggtatta tagggtaatg agtatccatc tagtatttaa gtatttacat aaattgcagt 60 

acttaaagta atctctttac aagttatttt atcaaaaact tttcagacac aattttttgg 120 

ggatttattc aaactgttta acacttaaga agtactggct taccttggag atactgctcg 180 
tttggtttca gaccactgtg atcaagcaaa aatcgcaata aagcaagtta catgaatttt 240

tttttcgttt cccagtgcat ataaaagtta cacagcagac tattaagtgt gcaacagcat 300

tatgtttaaa aatgtccata ccttaactta aaaatacttt attgttaaaa aatgctaacg 360 

atcatataag ccttcagcga gtgataatct ttttgctgat ggagggtctt gcttgatgtt 420 

cagagccttg ctgtggcttt ggcttaaggc ttaagggaat attgcagctg gtttgatctt 480 

ctatctagac tgctcaaatt ttctgcatat cagcaataag gctgctctgc tctcttatca 540 

tttgtgtgtt cactggagta gcacttctaa cttgcttcaa gaacttttct tttgcatttg 600 

caactcggat aactggtgca agaggactgg cttttgacct aactcatctt tgggcatgcc 660 

tttccccaaa agcttaattt atttctagct tttgatttca aggaagagac gcgcaactct 720 

tcctttcact tgagtactta gaggtcattg cagggctatc aattggccta atttcaataa 780 

tgttgtgttt taggaaatag agaagcctga ggggagggag agagacgggt gaacagctcg 840 



tcagtggagt agtcagaata cacacatgaa tggattaagt ttgggttgtg gtttgtggtg 900 
cccaaaacaa ttatggcagt aacatcaaag atcactgatc acagatcatc atgtaaaata 960 
ataaggaaat atttgaaata ttgcaagaat taccaaaatg tgacacggag acacaaagtg 1020 



agcacatgct gtgggaaaaa cggcaccaac agacttgctc aattcgagga caccacaaaa 1080 

cttaatttgt aaaaacacat tatctgtgaa gtacaataaa gtgaagggca ataaaatgat 1140 

gtatgcctat gtaaggcaat cagtagatga tgggaaaaaa acattgcatg atttagaaaa 1200 

aacaaagaga atatgttatc aaaatgacta aactaatagc ataattagaa tttcatttga 1260 

gtatttcttt atagttttga gagatttaaa attatgtatt attttataaa ttattatgga 1320 

ggatctccta tatacccagt ctcagactta ttttggtgat tatactctgg aacatgtgat 1380 

tcttctcctc gtggggttaa aaaaatttat accatcctat ggggtatgac taatctgaat 1440 

ctcacacttg aatattactt tgggatctta ggcaagttat ttaagaataa aaataactta 1500 

ctatgtttcc tcaactataa aatgagaatt ttaataatct taaacttact gtaaggatga 1560 

aataattttc aatagtatgt aatatgatgc ttagcataca ttaagatctc agtgtatatt 1620 

agcaacaatt tcagtaaaga aagaccaaat aatttttgtc aagaaatatg aatatataaa 1680 
ttatataggt tttaagttgt atttaccata tttaatgtga cagtaaaaaa agtcacgaaa 1740

atgtgtgacc taataagttt attcagtttt ctaatgtcct gaacccctta tctcagatgg 1800 

attttgctcc aaacttataa caataattta caaccctgac tctagttttt ttttctgaga 1860 

gaaaaaaata aatagaaaca ctgttctttt tctttcctta cctacaggaa tttacttaca 1920 

gaaaaatcta acttctttta aaaacagcct taatcccttg ttgggccaag ggaaaacttt 1980 

tccattgttc tctgaaggtt tgctaaaaaa aaaattactg tcaagaggca gatcaataga 2040 

agaaaaggca tacacattta tttgatcata attttacaca acccgagagc ctttagaaca 2100 

aagacccaaa gttacaaaag aaattgtcca tttttatgct taggttcaac aaagtgtggg 2160 

caggtgtgga gaaatacaac tggacaaaag gaatatgatc tcatgctaac agactgagtg 2220 

gggacgcctg gcaaggtgag attcttcctg gtatctctgt gcagtactca ttccttctgg 2280 

gtatggggca ggaccttctt tggaatgggg tcttatgagc tacgatcaaa caaggtaggt 2340 

cagataatgt ctttatggcc agatttcaca cagaaagttg aggtgttaga gtgatatgct 2400 

taggttttat ggctggtttg ggaaaaaggg ttctggtttc taggagccac cttgggaaag 2460 

agggattcta gtttctatgc ctcgccttgg gggagaatga agggccggag actggagagc 2520

aggagaaggt cagagagagc tgattctgag gtcttcattt ggggtatcat ttttctgagc 2580 

ccctacaccc taataaagca caagagatgc agtggagcaa ttcagggtca cggtcaggct 2640 

atgcattgaa ctgagatttc ccaaaaagtc tactgaacag taaaaagaaa gtaaaatgga 2700 

tcctggggac accagacaga ggctgacaaa tgatttttaa gtaaggagaa aatgataaaa 2760 

gagaaggatt agcaatagaa acgggtcata taaaatagat ccctcaaaag gaattctctt 2820 

aatccctagc ttctctagat atcccacaac ctcagggact tatcaggcag gttgtttttc 2880 

cctgaaagtg ggggtaaggg agctggagga caaatgaagg tggtatgtgg agggaaggct 2940 

gttctgtgga tgagtttaat tcagccccac aatcacttct gtacagctac ccaccgctct 3000 

agtcattccc acatttggcc tgctttcttt tcctctgtgg acaggggcac tgttctctac 3060 

taatatccat ctcagagaga tacaggggca agtatccctc agcatccatt agaaataaag 3120 

caggctcttg cttaaagtta ccagagcatc cacctctggg tgcaaagaca aattctctga 3180 
atcaagtgag gggtctgggc aatgatctca caaggatttg atacctagga gtccccccat 3240

gcccatacaa gctcctcatc tttccactta cactttggga agctggctgt cgtgtacagg 3300 

cagatgaagc tggaaaagag aggcatattc agtactcacg aattcaaaca gcttgaggga 3360 

tttccggtga aagtcagtcc taaccagtgt atacgtacat acacaccaac atgtgtgaat 3420 

gtgttgtgtg cacgtgtgtg cctgtacaag tccacatggc atatttacct gtcagggaca 3480 

ggctatggac aatgactgtt tcttggactt tctcttaaaa agtcagatca gacaagttta 3540 

ttttgtatac tttgggtaaa tgtgtggtat ttcgtgagtt tggcagtttg tgaaaaaaaa 3600 

aaaaaaaaaa aaaaaaaaaa aaagctgcct gctctgagcc catggggcag gggcaatttt 3660 

ttcatctgac aatctgcgtg cttttgtttt gcttgcttat tttggcccca caataccaca 3720 

cccttttctt aactaacctc tttctacctg ggctggacgt gcctgggctc tcctccctgg 3780 

ccccgctccc acctctccca ggtctctaaa cccctagaga acctgtgtca gtgttttgaa 3840 

tccctcagtt gctctagcag gaaaactaga cagattagga gctggggcac atttggctga 3900 

aagacagctc ttcgctttct tcttatgctg cttccccttc ctcttttccc aaatagatat 3960 

ataaacacat gtattttcct gtttaaattg agcgaattgg tcccctgcct gtgccttgat 4020 

ttagccattg ggctcagcct tgctcctccc ttccttactc ggataggagc cactgggatc 4080 

tggagctcca gcttccaaat tgaagctggc ctcaggccag gtgacctttt ctttgtaagt 4140 

ttctttccta agcgtggggt tggggggagg cggggaatgg ggggggttgc agggatctgt 4200 

ttggtgctgt tgaagggggg gcgagtgagg aaaggagggg gctggaagag agtaaagggc 4260 

tgttgttaaa cagtttctta ccgtaagagg gagttcagac ctagatcttt ccagttaatc 4320 

acacaacaaa cttagctcat cgcaataaaa agcagctcag agccgactgg ctcttttagg 4380 

cactgactcc gaacaggatt ctttcaccca ggcatctcct ccagagggat ccgccagccc 4440 

gtccagcagc accatgtggg tgaccaaact cctgccagcc ctgctgctgc agcatgtcct 4500 

cctgcatctc ctcctgctcc ccatcgccat cccctatgca ggttagttcc cttcttcttc 4560 

ttcattatta gtattagtat ttaactctcc tgctaacctt ccctattcct tttaacaccc 4620 

tctttttacc ctattcccag catcctttct gaactcagta tgtagtatag gtttctaaaa 4680 
9o,c,oa„. ,„c ttm „ oacattottl tnottattfl tnflaatagc attuaMto ^
ataa.taao. Uccc.caac .cccc.ccac c.ccaaccca agccccgtcc cacUagcc, 4800 
aa.aoUg.g gat.a.gaga taggaaggaa g.gc.aa.ac tooot.aao. ,ggc,g cm 4860 

ggacaag... aaagc.aaa, ag^to.g g.c.gaawg gcaagag.ga .gg.cag.cc 4820 

»0caggaag, ca,cc.«„c oagagaaoaa .m.ca.ga .aa.gcac.a c.ccaca.ca 4980 

cc.ag.caac att.a^cc aaa.lacgac ...g.acagg «, lt0 a t „, gaggagpcag 5040 

aa.aaac.ct gagta.Ugc a.a.ca.aaa aa.gaaagag aaagcc.c. u.aaaga.c 5.00 

Ifttottt. .ggg.acgga .gcc.gccc. ««gaaac,gc ag.gcacgg* gac.gau 5,60 

aaagcgcag aac.gccca. Cc.g.Ccc caC.c.cc c<(„ga«„ 0 M 

*oao..gc. .gaaaguca «a t .gc„„g aga.tta^g a,c.cg,„g cgc.Cggg sao 

«Wtctct tg.ta.cagg ac!amima acatct0 , a , ^ 

»c. 9 agg,gc caacgggaga aggcag.gaa .a.caa^g, aggcgcagg, gaa.aaaaga 5400 

gtgggaacaa a.gcccaoa, ggagaca.gg ccmuaca a.a.aaaaaa (agaacggc 5460 

Wttotttt gaga.gg.aa a.a.gaca .. .a.cagacc. ..ga.Cag. t ., toa , a , n m 

OUcaaggg. taaaaaactc aaoaattUc ^ 



.«....«. „a«„.„,g aaotw:cco , U00aaaat 9acactotu goaaaaMt(! ^ 

accgaaag ca.Uagg.a agamcga agaag.gaaa aagcag.gag „ caaatcaa 5700 

acaggUa.c a.gc.gaca ,g t g, catat taaaatooc( tcacWc ^ 

OCcacgcc. g.aa.cccag cacU.ggga ggccgaggcg ggcaga.cac 0 agg tMM a ^ 

°a..gagacc a.cc.agc.a acaagg.gaa accc.g.c.c tac.aaaaa, acaaaaaa,, 588O 

aoccaggcg. gg.ggcaggc acc.g.ag.c ccacc.ac, gggaggc.ga ggc^ 5940 

.c.C.gaac c.gggagg.g gggg,,^ tgag<xgaga ^ 

ctggggaacg gagcaagac, cca.c.caag aagaagaaga aaaaaa.gc, tcacaga.ga 6060 

otgcgg... agggga.m gagcttaaat toaaatMtg ^ 

ca.U.taaa ga.taaaa.g tcac.gUc, .aag.agaa, c.ggl.acc. oaat.ca.c. 6,80 
Qtgctaacgc aaggggaaco cagtgtggaa aacccaaaca gtagatcaac cgtaggcagt 6240

gtctatttgt tttcggcatg cattatgaac ttttggcagg agacatacat ttgtaattat 6300 

atttcacttt gcctaatgta gaaatgactg tgtttcctga gtacaggcag aatgcagccc 6360 

aagagtgctg ocaggcaagg agagtccagt tgggaattac aaatatgctg tgaataattc 6420 

ctgaagtgga taattctaaa attgtcatca aaggagggtg cgcctttgtt tagatggcca 6480 

gtttgatagt tttttttaat aacctttaaa ataaaaaata tgggtagcct cttagaacac 6540 

acaaagtttg ttctttttta aatgacattt aatattgact atttagaggt ttcttttgtt 6600 

gttactagct ttgattataa ttatttattc tatgaattta tatttgtatg tattgtaaaa 6660 
taacacattg ttaggaaaga agtatatact gtaagttgac aaccagttat caacagaata 6720 

cactatggag atactttttt aaaagcttaa gaaatattca atataatggg cccccgccat 6780 

ctttgtagga gttagcctat atagaattac cctctattca ctcccaccta catgggaaac 6840 



aaatatccaa tcctctgtaa taaaagaagc attaaatgag cacctaatat tcaagagtat 6900 
gtgggggatg taaagatgaa caaataagaa aggaacttaa atttgttgag caactgatat 6960 
gaaccaagta gtaaagtaca tctcacttaa ttctaataag 7000 



<210> 123 

<211> 21 

<212> PRT 

<213> Artificial 

<220> 

<223> zinc finger consensus 



<220> 

<221> MISC_FEATURE

<222> (2).. (2) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE 

<222> (3).. (3) 

<223> between 1 and 4 amino acids of any amino acid 
<220> 

<221> MISC_FEATURE

<222> (5).. (5)

<223> any amino acid 

<220> 

<221> MISC_FEATURE 

<222> (6).. (6) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE 

<222> (7).. (7) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE 

<222> (8).. (8) 

<223> any amino acid, often aromatic 
<220> 

<221> MISC_FEATURE 

<222> (9).. (9) 

<223> any amino acid

<220> 

<221> MISC_FEATURE 

<222> (10). .(10) 

<223> any amino acid 



<220> 

<221> MISC_FEATURE

<222> (11). .(11) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (12). .(12) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (13). .(13) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE 

<222> (14).. (14) 

<223> any amino acid, often hydrophobic 
<220> 

<221> MISC_FEATURE

<222> (15). .(15)

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (16). .(16) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (18). .(18) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (19). .(19) 

<223> any amino acid 

<220> 

<221> MISC_FEATURE

<222> (20).. (20) 

<223> between one and three residues of any amino acid

<400> 123 

Cys Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 

1 5 10 15 



His Xaa Xaa Xaa His 
20 
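
The feature annotations for SEQ ID NO: 123 amount to a spacing rule: Cys, two to five residues, Cys, twelve residues, His, three to five residues, His. A minimal sketch of that rule as a regular expression follows. It assumes peptides are written in one-letter code, uses the loose residue class `[A-Z]` as a simplification, and the function name `matches_consensus` is hypothetical.

```python
import re

# SEQ ID NO: 123 spacing in one-letter code:
# Cys, 2-5 residues, Cys, 12 residues, His, 3-5 residues, His.
# [A-Z] is a simplification; it does not restrict to the 20 standard residues.
ZINC_FINGER = re.compile(r"C[A-Z]{2,5}C[A-Z]{12}H[A-Z]{3,5}H")


def matches_consensus(peptide: str) -> bool:
    """True if the whole peptide fits the SEQ ID NO: 123 spacing (hypothetical helper)."""
    return ZINC_FINGER.fullmatch(peptide.upper()) is not None


# Synthetic peptides with alanine filler; these are not sequences from the listing.
shortest = "C" + "A" * 2 + "C" + "A" * 12 + "H" + "A" * 3 + "H"
too_long = "C" + "A" * 6 + "C" + "A" * 12 + "H" + "A" * 3 + "H"
print(matches_consensus(shortest), matches_consensus(too_long))
```

The check covers spacing only; the "often aromatic" and "often hydrophobic" preferences at positions 8 and 14 are not enforced.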



<210> 124 

<211> 21 

<212> PRT 

<213> Artificial 

<220> 

<223> RDER Motif for a zinc finger domain 
<220> 

<221> misc_feature 

<222> (2).. (2) 

<223> any amino acid 

<220> 

<221> misc_feature 

<222> (3).. (3) 

<223> between 1 to 4 residues of any amino acid 
<220>

<221> misc_feature

<222> (5).. (7) 

<223> any amino acid 

<220> 

<221> misc_feature 

<222> (8).. (8) 

<223> any amino acid, frequently aromatic 
<220> 

<221> misc_feature

<222> (9).. (9) 

<223> any amino acid 

<220> 

<221> misc_feature 

<222> (11). .(11) 

<223> any amino acid 

<220> 

<221> misc_feature

<222> (14). .(14) 

<223> any amino acid, typically hydrophobic 
<220> 

<221> misc_feature 

<222> (15).. (15)



<223> any amino acid 
<220> 

<221> misc_feature 

<222> (18). .(18) 

<223> any amino acid 

<220> 

<221> misc_feature 

<222> (19). .(19) 

<223> any amino acid 

<220> 

<221> misc_feature 

<222> (20).. (20) 

<223> between 1 and 3 residues of any amino acid

<400> 124 

Cys Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Arg Xaa Asp Glu Xaa Xaa Arg 
1 5 10 15

His Xaa Xaa Xaa His
20 



<210> 125 

<211> 6 

<212> PRT 

<213> Artificial 

<220> 

<223> exemplary linker consensus 



<220> 

<221> misc_feature 

<222> (3).. (3) 

<223> Glu or Gln

<220> 

<221> misc_feature 

<222> (4).. (4) 

<223> Arg or Lys

<220> 

<221> misc_feature 

<222> (6).. (6) 

<223> Tyr or Phe 



<400> 125 

Thr Gly Xaa Xaa Pro Xaa 
1 5 
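
The linker consensus of SEQ ID NO: 125 fixes three positions and allows two residues at each of the other three, so it covers eight concrete hexapeptides. A short sketch that enumerates them is given here. The names `POSITIONS` and `ONE_LETTER` are illustrative, and the conversion to one-letter code is added for compactness; the listing itself uses three-letter codes.

```python
from itertools import product

# Allowed residues at each position of SEQ ID NO: 125, in three-letter code
# as given by the feature annotations of the listing.
POSITIONS = [("Thr",), ("Gly",), ("Glu", "Gln"), ("Arg", "Lys"), ("Pro",), ("Tyr", "Phe")]

# Standard one-letter codes for the residues used above.
ONE_LETTER = {"Thr": "T", "Gly": "G", "Glu": "E", "Gln": "Q",
              "Arg": "R", "Lys": "K", "Pro": "P", "Tyr": "Y", "Phe": "F"}

# Every combination of the allowed residues, written in one-letter code.
linkers = ["".join(ONE_LETTER[res] for res in combo) for combo in product(*POSITIONS)]
print(len(linkers), linkers)
```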



<210> 126 

<211> 30 

<212> PRT 

<213> Artificial 

<220> 

<223> Exemplary N-terminal sequences 
<400> 126 

Met Val Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Glu Leu Pro Pro Lys 
1 5 10 15 

Lys Lys Arg Lys Val Gly Ile Arg Ile Pro Gly Glu Lys Pro
20 25 30